Preface 



This databook covers the Nx586™ processor (called the processor). The databook is written for 
system designers considering the use of these devices in their designs. We assume an experienced 
audience, familiar not only with system design conventions but also with the x86 architecture. The 
Glossary at the end of the book defines NexGen’s terminology, and the Index gives quick access to 
the subject matter. 

NexGen’s Applications Engineering Department welcomes your questions and will be glad to 
provide assistance. In particular, they can recommend system parts that have been tested and proven 
to work with NexGen™ products. 



Notation 

The following notation and conventions are used in this book: 

Devices and Bus Names 

■ Processor or CPU — The Nx586 processor described in this book. 

■ NxVL™ Systems Logic — The NxVL system controller described in the NxVL System 
Controller Databook. 

■ NxPCI™ Systems Logic — The NxPCI system controller described in the NxPCI System 
Controller Databook. 

■ NxMC™ Memory Logic — The NxMC memory controller described in the NxMC Memory 
Controller Databook. 

■ NexBus5 System Bus — The NexGen system bus, including its multiplexed address/status 
and data bus (NxAD<63:0>) and related control signals. 

■ NexBus Processor Bus — The Nx586 processor bus, including its multiplexed address/status 
and data bus (AD<63:0>) and related control signals. 

Signals and Timing Diagrams 

■ Active-Low Signals — Signal names that are followed by an asterisk, such as ALE*, indicate 
active-low signals. They are said to be "asserted" or "active" in their low-voltage state and 
"negated" or "inactive" in their high-voltage state. 

■ Active-High Signals — Signal names without an asterisk, such as GALE, indicate active-high 
signals. They are said to be "asserted" or "active" in their high-voltage state and "negated" or 
"inactive" in their low-voltage state. 

■ Bus Signals — In signal names, the notation <n:m> represents bits n through m of a bus. 

■ Reserved Bits and Signals — Signals or bus bits marked “reserved” must be driven inactive 
or left unconnected, as indicated in the signal descriptions. These bits and signals are 
reserved by NexGen for future implementations. When software reads registers with 
reserved bits, the reserved bits must be masked. When software writes such registers, it must 
first read the register and change only the non-reserved bits before writing back to the 
register. 

■ Source — In timing diagrams, the left-hand column indicates the "Source" of each signal. 
This is the chip or logic that outputs the signal. When signals are driven by multiple sources, 
all sources are shown, in the order in which they drive the signal. In some cases, signals take 
on different names as outputs are logically ORed in group-signal logic. 

■ Tri-state® — In timing diagrams, signal ranges that are high impedance are shown as a 
straight horizontal line half-way between the high and low level. 

■ Invalid and Don’t Care — In timing diagrams, signal ranges that are invalid or don't care are 
filled with a screen pattern. 
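
The rule for software access to registers with reserved bits (mask them on reads, preserve them on writes) can be sketched as follows. This is a minimal illustration in Python rather than NexGen code; the 32-bit register width and the `RESERVED_MASK` layout are hypothetical and chosen only to show the masking.

```python
# Hypothetical layout (assumption): bits 31:8 reserved, bits 7:0 defined.
REGISTER_WIDTH_MASK = 0xFFFFFFFF
RESERVED_MASK = 0xFFFFFF00


def read_defined_bits(raw: int) -> int:
    """Return a register value with the reserved bits masked off."""
    return raw & ~RESERVED_MASK & REGISTER_WIDTH_MASK


def value_to_write(current: int, new_bits: int) -> int:
    """Read-modify-write: keep reserved bits as read, change only the rest."""
    preserved = current & RESERVED_MASK
    changed = new_bits & ~RESERVED_MASK & REGISTER_WIDTH_MASK
    return preserved | changed
```

For example, if a register reads back as `0xABCDEF12`, software would use only `0x12`, and writing a new defined field of `0x34` would store `0xABCDEF34`.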

Data 

■ Quantities — A word is two bytes (16 bits), a dword or doubleword is four bytes (32 bits), 
and a qword or quad word is eight bytes (64 bits). 

■ Addressing — Memory is addressed as a series of bytes on eight-byte (64-bit) boundaries, in 
which each byte can be separately enabled. 

■ Abbreviations — The following notation is used for bits and bytes: 

    Bits     b    as in “64b/qword”
    Bytes    B    as in “32B/block”
    kilo     k    as in “4kB/page”
    Mega     M    as in “1Mb/sec”
    Giga     G    as in “4GB of memory space”



■ Little Endian Convention — The byte with the address xx...xx00 is in the least-significant 
byte position (little end). In byte diagrams, bit positions are numbered from right to left: the 
little end is on the right and the big end is on the left. Data structure diagrams in memory 
show small addresses at the bottom and high addresses at the top. When data items are 
“aligned,” bit notation on a 64-bit data bus maps directly to bit notation in 64-bit-wide 
memory. Because byte addresses increase from right to left, strings appear in reverse order 
when illustrated according to the little-endian convention. 

■ Bit Ranges — In a range of bits, the highest and lowest bit numbers are separated by a colon, 
as in <63:0>. 

■ Bit Values — Bits can either be set to 1 or cleared to 0. 

■ Hexadecimal and Binary Numbers — Unless the context makes interpretation clear, 
hexadecimal numbers are followed by an h, binary numbers are followed by a b, and 
decimal numbers are followed by a d. 
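
The little-endian convention and the eight-byte addressing described above can be illustrated with a short Python sketch. The function names are hypothetical; the only facts taken from the text are that the byte at address xx...xx00 sits in the least-significant position and that the bus is 64 bits wide.

```python
def byte_lane(address: int) -> int:
    """Byte position (0..7) of an address within its eight-byte qword."""
    return address & 0x7


def lane_bits(lane: int):
    """Bit range (high, low) that a byte lane occupies on a 64-bit bus."""
    return (8 * lane + 7, 8 * lane)


def pack_qword(data: bytes) -> int:
    """Place eight bytes on the bus; the lowest address lands in bits <7:0>."""
    if len(data) != 8:
        raise ValueError("a qword is exactly eight bytes")
    return int.from_bytes(data, "little")
```

Packing the bytes 00h through 07h in address order gives 0706050403020100h, which is why strings appear reversed when drawn with the little end on the right.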



Related Publications 

The following books treat various aspects of computer architecture, hardware design, and 
programming that may be useful for your understanding of NexGen products: 

NexGen Products 

■ NxVL System Controller Databook, NexGen, Milpitas, CA, Tel: (408) 435-0202. 

■ NxPCI System Controller Databook, NexGen, Milpitas, CA, Tel: (408) 435-0202. 

■ NxMC Memory Controller Databook, NexGen, Milpitas, CA, Tel: (408) 435-0202. 

Bus Standards 

■ VESA VL-Bus Version 2.0, Video Electronics Standards Association, San Jose, CA, 1993. 

■ PCI Local Bus Specification Revision 2.0, Peripheral Component Interconnect Special 
Interest Group, Hillsboro, Oregon, 1993. 

x86 Architecture 

■ John Crawford and Patrick Gelsinger, Programming the 80386, Sybex, San Francisco, 1987. 

■ Rakesh Agarwal, 80x86 Architecture & Programming, Volumes I and II, Prentice-Hall, 
Englewood Cliffs, NJ, 1991. 

General References 

■ John L. Hennessy and David A. Patterson, Computer Architecture, Morgan Kaufmann 
Publishers, San Mateo, CA, 1990. 

Nx586 Features and Signals 



The Nx586 Processor 
Features 



NexGen has independently developed a high-performance x86 processor design utilizing 
state-of-the-art technologies. The Nx586 processor is the first implementation of NexGen's 
innovative and patented RISC86 microarchitecture and also includes the five key elements found in 
fifth-generation processors: superscalar execution, on-chip Harvard-architecture L1 code and data 
caches, branch prediction, 64-bit-wide buses, and advanced floating-point capabilities. The NexGen 
Nx586 processor is an advanced fifth-generation, 32-bit, superscalar, x86-compatible processor that 
provides market-leading performance. The Nx586 represents the core building block of a new class 
of personal computers. 

The following are some of the key features of the Nx586 Processor: 

■ Full x86 Binary Compatibility — Supports 8, 16 and 32-bit data types and operates in real, 
virtual 8086 and protected modes. 

■ Patented RISC86™ Superscalar Microarchitecture — Multiple operations are executed 
simultaneously during each cycle. 

■ Multi-Level Storage Hierarchy — Branch prediction, readable write queue, on-chip L1 
code and data caches, and unified L2 cache. 

■ Separate (Harvard Architecture) On-Chip L1 Code and Data Caches — Supports on-chip 
4-way, 16kB code and 16kB data caches using the MESI cache-consistency protocol. 

■ On-Chip L2 Cache Controller — Supports a 4-way, unified cache with the MESI modified 
write-back cache-coherency protocol on 256kB or 1MB of external cache using standard 
asynchronous SRAMs. 

■ Patented Branch Prediction Logic — Reduces both control dependencies and branch cycle 
counts. 

■ Dual-Port Caches — 64-bit reads and writes are serviced in parallel in a single clock cycle. 

■ Caches Decoupled From Processor Bus — Both the L1 and L2 caches are accessed on 
separate dedicated buses. 

■ Two-Phase, Non-Overlapped Clocking — Integrated phase-locked loop bus-clock doubler. 
Processor operates at twice the system bus frequency. 

■ Three 64-Bit Synchronous Buses — NexBus (the processor bus), the L2 SRAM bus, and the 
internal Floating-Point Unit bus are fully integrated into the processor microarchitecture. 

■ NexBus and NexBus5 Support — The Nx586 supports both NexBus interface protocols. 

Nx586 Processor 

with Floating-Point Execution Unit 
Features 



The NexGen Nx586 Processor is available with an integrated floating-point execution unit. The 
floating-point execution unit is an expansion of the Nx586 superscalar pipelined microarchitecture. 
It adds specific x86 architecture floating point operations including arithmetic, exponential, 
logarithmic, and trigonometric functions. This execution unit is part of the RISC86 pipeline to 
ensure maximum floating-point calculation speed. This version of the Nx586 is plug compatible 
with the Nx586. The following are some of the key features: 

■ Nx586 Feature Set — Includes all the features of the Nx586. 

■ MCM Technology — The Processor and Floating-Point Unit are housed in a Multi-Chip- 
Module. 

■ Floating-Point Unit Fully Integrated into RISC86 Microarchitecture — Operates in 
parallel with the Nx586 Address and Integer Units. Performance is increased by 
speculative floating-point requests. 

■ Binary Compatible — Runs all x86-architecture floating-point binary code. 

■ Optional — No hardware reconfiguration necessary if not present. Pin compatible with 
Nx586. 

■ Dedicated Internal 64-Bit Processor Bus — Fast, synchronous, non-multiplexed interface 
to Nx586 Processor Core. 

■ High Bus Bandwidth — Simple arbitration on the Floating-Point bus to maximize 
bandwidth. Arbitration and data transfers occur in parallel, one clock apart. 



The Nx586 processor fully implements the industry-standard x86 instruction set, so it can run the 
vast number of applications and operating systems available. This implementation is accomplished 
through the use of NexGen’s patented RISC86 microarchitecture. The innovative RISC86 approach 
dynamically translates x86 instructions into RISC86 instructions. As shown in the figure below, the 
Nx586 takes advantage of RISC performance principles. Because of the RISC86 environment, each 
execution unit is more specialized, smaller, and more compact. The RISC86 microarchitecture 
incorporates many state-of-the-art techniques to achieve very high performance, including 
register renaming, data forwarding, speculative execution, and out-of-order execution. 




Figure 1 Nx586 Functional Block Diagram 

The Level-2 cache controller is on chip for increased performance and reduced access overhead. 
The L2 cache controller does not have to arbitrate for the dedicated L2 Cache bus. L2 cache 
accesses can begin on any clock cycle. The greatest advantage comes when the CPU operating 
frequency is high. Accesses to the L2 cache remain at full speed and not at the slower system bus 
rate. Therefore, the Nx586 scales in performance linearly with respect to the operating frequency. 

Nx586 Signals 



Figure 2 shows the signal organization for the Nx586 processor. The processor core supports signals 
for NexBus (the processor bus) or NexBus5 (the system bus), the L2 cache, and the optional 
Floating-Point Unit. Many types of devices can be interfaced to NexBus5, including a backplane, 
multiple Nx586 processors, shared memory subsystems, high-speed I/O, and industry-standard buses. 
All signals are synchronous to the NexBus5 clock (NxCLK) and transition at the rising edge of the 
clock, with the exception of four asynchronous signals: INTR*, NMI*, GATEA20, and SLOTID<3:0>. 
All bi-directional NexBus5 signals are floated unless they are needed during specific time periods, 
as specified in the Bus Operation chapter. The normal state for all reserved bits is high. 

NexBus is the original processor protocol that defines how a localized processor, coprocessor, and 
L2 cache are connected to the NexBus5 system interface protocol in a multi-processor environment. 
Processors using the NexBus standard must provide bus transceivers to convert the NexBus interface 
to NexBus5. 

One type of NexBus signal deserves special mention: 

■ Buffered Address and Data Bus — Address, status, and data phases are multiplexed on the 
AD<63:0> bus. This bus is interfaced to NexBus5 through transceivers, for which control 
signals are provided by the processor. 

Two types of NexBus5 signals deserve special mention: 

■ Group Signals — There are several group signals on NexBus5, typically denoted by 
signal names beginning with the letter "G." Active-low signals such as ALE* are driven by 
each NexBus5 device, and the NexBus5 arbiter derives an active-high group signal (such as 
GALE) and distributes it back to each device. 

■ Central Bus Arbitration — Access to NexBus5 is arbitrated by an external NexBus5 
Arbiter. NexBus5 masters request and are granted access by this Arbiter. For the Nx586 
processor, central bus arbitration has the advantage of back-to-back processor access most 
of the time while supporting fast switching between masters. Typical systems logic will 
provide the combined functions of NexBus5 Arbiter and Alternate-Bus Interface (the 
system-logic interface to other system buses). The memory controller function may be 
included or designed as a separate device connected to NexBus5. 
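
The derivation of a group signal can be sketched in a few lines of Python. The signal descriptions later in this databook state that group signals such as GALE are the NAND of the corresponding active-low signals; the function name and the representation of each line as a 0/1 voltage level are assumptions made for illustration.

```python
def group_signal(active_low_levels) -> int:
    """NAND of active-low lines given as voltage levels (0 = low, 1 = high).

    The result is high (asserted) when at least one device drives its
    active-low line low (asserted), and low when every line is high.
    """
    return int(not all(active_low_levels))
```

With three devices on the bus, `group_signal([1, 0, 1])` models one device asserting ALE*, so GALE is asserted; `group_signal([1, 1, 1])` models an idle bus with GALE negated.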

[Figure 2 signal groups legible in the source: L2 Cache (Cache Control, Cache Bank, Address, Data)] 

Figure 2 Nx586 Signal Organization 

Nx586 Pinouts by Signal Names 

[Pin-list table; legible entries: VSS (ground pins); XACK* (O), XBCKE* (O), XBOE* (O), 
XHLD* (O), XNOE* (O), XPH1 (O), XPH2 (O), XREF (O), XSEL (I)] 

Figure 3 Nx586 Pin List, By Signal Name 

Nx586 Pinouts by PGA Pin Numbers 




Signal Name 

NC, NC, NC, NC, NPTAG<0>, CDATA<6>, CDATA<1>, CDATA<15>, CDATA<8>, CDATA<23>, 
CDATA<16>, NC, CDATA<24>, CADDR<7>, CADDR<9>, CBANK<1>, NC, ANALYZEIN, NC, NC, 
NC, NC, NC, CDATA<4>, NC, CDATA<13>, CDATA<10>, CDATA<21>, CDATA<18>, CDATA<30>, 
CDATA<26>, CADDR<6>, CADDR<8>, CADDR<10>, CADDR<17>, HROM, NC, VSS, VSS, VSS, 
VSS, VSS, VSS, VSS, VSS, VSS, VSS, VSS, VSS, VSS, VSS, VSS, VSS, CWE<4>*, NC 




Signal Name 

VCC4, VCC4, VCC4, VCC4, VCC4, VCC4, VCC4, VCC4, VCC4, VCC4, VCC4, VCC4, VCC4, 
VCC4, VCC4, VCC4, NxADINUSE, NC, VSS, P4REF, NC, NC, NC, CDATA<5>, CDATA<2>, 
CDATA<14>, CDATA<9>, NC, CDATA<29>, CDATA<27>, CDATA<36>, CADDR<13>, CBANK<0>, 
CADDR<11>, VSS, CDATA<35>, NC, VCC4, NC, NC, NC, NC, CDATA<7>, CDATA<0>, NC, 
CDATA<22>, CDATA<17>, CDATA<31>, CDATA<25>, CADDR<14>, CADDR<12>, TPH1, VCC4, 
CDATA<34>, NPTAG<3>, VSS 



Signal Name 

GREF, PULLHIGH, NC, NC, CWE<0>*, CDATA<12>, CDATA<11>, CWE<2>*, CDATA<19>, 
CDATA<28>, CADDR<4>, CADDR<5>, CDATA<37>, TPH2, SLOTID<3>, VSS, CDATA<33>, 
SLOTID<0>, VCC4, NC, PULLHIGH, NC, NC, CDATA<3>, CWE<1>*, COEB*, CDATA<20>, 
CWE<3>*, CADDR<3>, CADDR<15>, NC, SERIALIN, PULLHIGH, VCC4, CDATA<32>, NC, VSS, 
SCLKE, NC, POPHOLD, VSS, CDATA<44>, NC, VCC4, NC, NC, PULLHIGH, SLOTID<1>, VCC4, 
CDATA<45>, NC, VSS, NC, NC, ANALYZEOUT 



Signal Name 

CADDR<16>, VSS, CDATA<46>, NC, VCC4, NC, NC, CDATA<38>, CWE<5>*, VCC4, CDATA<47>, 
NC, VSS, NC, NC, CDATA<39>, CDATA<43>, VSS, NC, NC, VCC4, NC, NC, COEA*, CDATA<41>, 
VCC4, CDATA<42>, NC, VSS, NC, NC, CWE<6>*, CDATA<52>, VSS, CDATA<40>, NC, VCC4, 
NC, NC, CDATA<54>, VCC4, CDATA<53>, NC, VSS, RESET*, NC, CWE<7>*, CDATA<48>, VSS, 
CDATA<55>, NC, VCC4, NC, NC, CDATA<51> 



Figure 4 Nx586 Pin List, By PGA Pin Number 

Figure 5 Nx586 Pin List, By JEDEC Pin Number 

Nx586 NexBus/NexBus5 Signals 

NexBus/NexBus5 Arbitration 



NREQ*  O  NexBus Request — Asserted by the processor to the NexBus5 
arbiter to secure control of the system bus. The Nx586 drives 
NREQ* active on the rising edge of NxCLK. The bus is granted 
when the arbiter asserts GNT*. The "grant" becomes effective 
only when the Nx586 asserts ALE* or LOCK*. NREQ* remains 
active until one NxCLK period after GALE is received from the 
NexBus5 arbiter. During speculative reads, the Nx586 may 
deactivate NREQ* before GNT* is received if the transfer is no 
longer needed. 

If the processor does not know which bus its intended 
resource is on, it asserts NREQ*. If GTAL is subsequently 
returned, the processor assumes the resource is on another 
system bus and retries the transfer by asserting AREQ*. 

The processor may at any time perform speculative cycles 
that terminate prematurely. It does so by asserting NREQ* 
and then removing NREQ* before GNT* is asserted. 


AREQ*  O  Alternate-Bus Request — Asserted by the processor to the 
NexBus5 arbiter to secure control of the system bus and any 
other buses (called alternate buses) supported by the system. 
This signal remains active until GNT* is received from the 
NexBus5 arbiter; unlike NREQ*, the processor does not 
make speculative requests with AREQ*. The arbiter does not 
issue GNT* until the other system buses are available. 
AREQ* is driven on the rising edge of NxCLK. 


GNT*  I  Grant NexBus — Asserted by the NexBus5 arbiter to indicate 
that the processor has been granted control of the system bus. 
GNT* is asserted on the rising edge of NxCLK and is held 
active until a valid ALE*. GNT* can be active for a 
minimum of two NxCLKs if ALE* is driven immediately 
after GNT* is received. 

LOCK*  O  Bus Lock — Asserted by the processor to the NexBus5 arbiter 
when multiple bus operations should be performed 
sequentially and without interruption. This signal is used by the 
NexBus5 arbiter to determine the end of a bus sequence. 
Cache-block fills are not locked; they are implicitly treated 
as atomic reads. Some NexBus5 arbiters may allow masters 
on system buses other than NexBus5 (i.e., on an alternate 
bus) to intervene in a locked NexBus5 transaction. To avoid 
this, the processor must assert AREQ*. 

LOCK* is typically software configured to be asserted for 
read-modify-writes and explicitly locked instructions. 


SLOTID<3:0>  I  NexBus Slot ID — These bits identify NexBus5 backplane 
slots. SLOTID 1111 (0Fh) is reserved for the system’s 
primary processor. Normally, only the primary processor 
receives PC-compatible signals such as RESET*, 
RESETCPU*, INTR*, NMI*, and GATEA20, and this 
processor is responsible for initializing any secondary 
processors. SLOTID 0000 is reserved for the systems logic 
that interfaces NexBus5 to other system buses (called the 
alternate-bus interface). This signal is asynchronous to the 
NexBus5 clock. 
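
The ordering of the request signals described in this section can be summarized with a small Python sketch. It is a simplified trace, not a cycle-accurate model: the function name is hypothetical, speculative withdrawal of NREQ* and LOCK* are left out, and the assumption that GTAL arrives after the first ALE* follows from the description of GTAL as a response to an attempted bus-crossing operation.

```python
def request_trace(gtal_returned: bool) -> list:
    """Signals seen, in order, for one transfer that starts with NREQ*.

    gtal_returned: True if the alternate-bus interface answers the first
    attempt with GTAL (assumed here to follow the first ALE*).
    """
    trace = ["NREQ*", "GNT*", "ALE*"]
    if gtal_returned:
        # The first attempt is aborted and retried with AREQ*, so GNT*
        # is withheld until the other system bus is available.
        trace += ["GTAL", "AREQ*", "GNT*", "ALE*"]
    return trace
```

Calling `request_trace(False)` gives the common case of a single NREQ*/GNT*/ALE* sequence, while `request_trace(True)` shows the retry through AREQ*.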

NexBus/NexBus5 Cycle Control 



ALE*  O  Address Latch Enable — Asserted by the processor to 
backplane logic or to the systems logic interface between 
NexBus5 and other system buses (called the alternate-bus 
interface) when the processor is driving valid addresses and 
status information on the NxAD<63:0> bus. After GNT* is 
received, ALE* is driven active on the rising edge of NxCLK 
for one NxCLK. All ALE* signals are NANDed on the bus 
backplane or systems logic to generate GALE. 


GALE  I  Group Address Latch Enable — Asserted by a backplane 
NAND of all ALE* signals, to indicate that the NexBus5 
address and status can be latched. GALE should be 
monitored by all devices on NexBus5 to latch the address 
placed on the bus by the master. 


GTAL  I  Group Try Again Later — Asserted by the systems logic 
interface between NexBus5 and other system buses (called 
the alternate-bus interface) to indicate that the attempted 
bus-crossing operation cannot be completed, because the 
systems logic bus interface is busy or cannot access the other 
system buses. In response, the processor aborts its current 
operation and attempts to retry it by asserting AREQ*, 
thereby assuring that the processor will not receive a GNT* 
until the desired system bus is available. 

A bus-crossing operation can happen without the systems 
logic bus interface asserting GTAL and without the processor 
asserting AREQ*, if the other system bus and its systems 
logic interface are both available when the processor asserts 
NREQ*. The GTAL and AREQ* protocol is only used when 
NREQ* is asserted while either the other system bus or its 
systems logic interface is unavailable. The protocol prevents 
deadlocks and prevents the processor from staying on 
NexBus5 until the other system bus becomes available. 

Unlike other group signals, which are the NAND of a set of 
active-low signals generated by each participating device in 
the group, GTAL does not have a corresponding active-low 
signal. 


XACK*  O  Transfer Acknowledge — This signal is driven active by the 
processor during a NexBus5 snoop cycle (Alternate Bus 
Master cycle), when the processor determines that it has data 
from the snooped address. 

GXACK  I  Group Transfer Acknowledge — Asserted by a backplane 
NAND of all XACK* signals, to indicate that a NexBus5 
device is prepared to respond as a slave to the processor’s 
current operation. The systems logic interface between 
NexBus5 and other system buses (called the alternate-bus 
interface) monitors the XACK* responses from all adapters. 

In general, since the systems logic interface to other system 
buses may take a variable number of cycles to respond to a 
GALE, the maximum time between assertion of GALE and 
the responding assertion of GXACK is not specified. 

XHLD*  O  Transfer Hold — Asserted by the processor, as slave or 
master, to backplane logic or to the systems logic interface 
between NexBus5 and other system buses (called the 
alternate-bus interface) in response to another NexBus5 
master's request for data, when the processor is unable to 
respond on the next clock after GXACK. 

When the processor is the master, an active XHLD* 
indicates that the CPU is not ready to complete the transfer 
(this situation may occur for speculative cycles). Slaves 
supply read data in the clock following the first clock during 
which GXACK is asserted and GXHLD is negated (that is, 
all XHLD* signals are negated). 

GXHLD  I  Group Transfer Hold — Asserted by a backplane NAND of 
all XHLD* signals, to indicate that a slave cannot respond to 
the processor's request. GXHLD causes wait states to be 
inserted into the current operation. Both the master and the 
slave must monitor GXHLD to synchronize data transfers. 

During a bus-crossing read by the processor, the 
simultaneous assertion of GXACK and negation of GXHLD 
indicates that valid data is available on the bus. During a 
bus-crossing write, the same signal states indicate that data 
has been accepted by the slave. 
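
The handshake between GXACK and GXHLD can be sketched as a scan over per-clock samples. This Python fragment is only an illustration of the rule stated above (read data follows the first clock in which GXACK is asserted and GXHLD is negated); the function name and the list-of-samples representation are assumptions.

```python
def first_data_clock(gxack, gxhld):
    """Index of the clock in which a slave supplies read data.

    gxack, gxhld: per-clock samples, 1 = asserted, 0 = negated.
    Returns None if the condition never occurs in the sampled window.
    """
    for clock, (ack, hold) in enumerate(zip(gxack, gxhld)):
        if ack and not hold:
            return clock + 1  # data arrives in the following clock
    return None
```

In this sketch every clock in which GXHLD stays asserted simply delays the match, which corresponds to the wait states described for GXHLD.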

NexBus Cache Control 

DCL*  O  Dirty Cache Line — During reads by another NexBus5 
master, this signal is asserted by the processor to indicate 
that the location being accessed is contained in the 
processor's L2 cache in a modified (dirty) state. 

The requesting master's cycle is then aborted so that the 
processor, as an intervenor, can preemptively gain control of 
NexBus5 and write back its modified data to main 
memory. While the data is being written to memory, the 
requesting master reads it off NexBus5. The assertion of 
DCL* is the only way in which atomic 32-byte cache-block 
fills by another NexBus5 master can be preempted by the 
processor for the purpose of writing back dirty data. 

During writes by another NexBus5 master, this signal is 
likewise asserted by the processor to indicate that it has a 
modified copy of the data. But in this case, the initiating 
master is allowed to finish its write to memory. The arbiter 
must then guarantee that the processor asserting DCL* gains 
access to the bus in the very next arbitration grant, so that the 
processor can write back all of its modified data except the 
bytes written by the initiating master. (In this case, the 
initiating master's data is more recent than the data cached 
by the processor asserting DCL*.) 

GDCL  I  Group Dirty Cache Line — Asserted by a backplane NAND 
of all DCL* signals, to indicate that a NexBus5 device has, 
in its cache, a modified copy of the data being accessed. 
During reads, when the processor is the bus master, the 
processor aborts its cycle so that the other caching device 
can write back its data; the processor reads the data on the 
fly. During writes, when the processor is the bus master, the 
processor finishes its write before the device asserting DCL* 
writes back all bytes other than those written by the 
processor. 

GBLKNBL  I  Group Block Enable — Asserted by a memory slave to 
enable block transfers, and to indicate that the addressed 
space may be cached. Paged devices (such as video 
adapters) and any other devices that cannot support burst 
transfers or whose data is non-cacheable should negate this 
signal. 

OWNABL  I  Ownable — Asserted by the systems logic during accesses by 
the processor to locations that may be cached in the exclusive 
state. Negated during accesses that may only be cached in 
the shared state, such as bus-crossing accesses to an address 
space that cannot support the MESI cache-coherency 
protocol. All NexBus5 addresses are assumed to be 
cacheable in the exclusive state. 

The OWNABL signal is provided in case the systems logic 
needs to restrict caching to certain locations. In 
single-processor systems, the OWNABL signal is typically tied 
high for write-back configurations to allow caching in the 
exclusive state on all reads. 


SHARE*  O  Shared Data — The purpose of SHARE* is to let NexBus5 
caching devices (including caching devices on an alternate 
bus) indicate that the current read operation hit in a cache 
block that is present in another device's cache. It is asserted 
by the Nx586 during block reads by another NexBus5 master 
to indicate to the other master that its read hit is in a block 
cached by the processor. 


GSHARE  I  Group Shared Data — Asserted by a backplane NAND of all 
SHARE* signals, to indicate that the data being read must be 
cached in the shared state, if OWN* (NxAD<49>) is 
negated. However, if GSHARE and OWN* are both negated 
during the read, the data may be promoted to the exclusive 
state, since no other NexBus5 device has declared via 
SHARE* that it has cached a copy. Instruction fetches are 
always shared. 
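
The interaction of OWNABL and GSHARE on a read can be condensed into a small decision function. The Python sketch below covers only reads with OWN* negated, which is the case this section describes; behavior with OWN* asserted is not modeled. The function name and the boolean arguments (True = asserted) are illustrative assumptions.

```python
def fill_state_own_negated(ownabl: bool, gshare: bool,
                           instruction_fetch: bool = False) -> str:
    """MESI state for a cache-block fill when OWN* is negated.

    Returns "S" (shared) or "E" (exclusive).
    """
    if instruction_fetch:
        return "S"   # instruction fetches are always shared
    if not ownabl:
        return "S"   # location may only be cached in the shared state
    if gshare:
        return "S"   # another device declared a copy via SHARE*
    return "E"       # no other copy declared: may be promoted
```

In a single-processor system with OWNABL tied high and no other caching device asserting SHARE*, every data read in this sketch resolves to the exclusive state.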

NexBus Transceivers 



XBCKE*  O  NxAD Transceiver Bus Clock Enable — Asserted by the 
processor to clock registered transceivers and latch 
addresses/status and data from the AD<63:0> bus for 
subsequent driving onto the NxAD<63:0> bus. There is no 
comparable clock-enable for the NexBus5 side of these 
transceivers; they are always enabled on the NexBus5 side. 
Note: NxCLK is normally connected to the clocking pin for 
the AD<63:0> registers and an inverted NxCLK is connected 
to the clocking pin for the NxAD<63:0> registers. 


XBOE*  O  Transceiver to AD Bus Output Enable — Asserted by the 
processor to enable the registered transceivers and drive 
addresses and data onto the AD<63:0> bus from the 
NxAD<63:0> bus. Note: NxCLK is normally connected to 
the clocking pin for the AD<63:0> registers and an inverted 
NxCLK is connected to the clocking pin for the 
NxAD<63:0> registers. 


XNOE*  O  Transceiver to NxAD Bus Output Enable — Asserted by the 
processor to enable registered transceivers and drive 
addresses and data onto the NxAD<63:0> bus from the 
AD<63:0> bus. Note: NxCLK is normally connected to the 
clocking pin for the AD<63:0> registers and an inverted 
NxCLK is connected to the clocking pin for the 
NxAD<63:0> registers. 


XCVERE*  I  NexBus5 Transceiver Enable — XCVERE* determines what 
type of bus is generated by the processor. When pulled high, 
the Nx586 will generate the NexBus processor bus, which 
requires external transceivers to connect the processor to 
the NexBus5 system bus. If XCVERE* is tied low, the 
Nx586 generates NexBus5 directly. This pin is sampled by 
the processor while reset is active. 

NexBus/NexBus5 Address and Data 

NxAD<63:0>, AD<63:0>  NexBus or NexBus5 Address and Status, or Data — This 
bus carries address and status information during the 
"address and status phase" and up to 64 bits of data 
during a subsequent "data phase". XCVERE* determines the 
local bus mode. The Nx586 generates NexBus (AD) for 
XCVERE* asserted and NexBus5 for XCVERE* negated. 
The NexBus address and status is valid on the rising edge of 
XBCKE*. 

For either bus mode, the address and status are valid on 
NexBus5 when GALE is asserted. At that time, address 
NxAD<63:32> and status NxAD<31:0> are latched. The data 
phase occurs on the cycle after GXACK is asserted and 
GXHLD is simultaneously negated. 

To avoid contention, the two phases are separated by a 
guaranteed dead cycle (a minimum of one clock) which 
occurs between the assertion of GALE and the assertion of 
GXACK. 




Figure 10 NexBus/NexBus5 Address and Status Phase 

NxAD<1:0>, AD<1:0> (address phase)  I/O  Reserved — These bits must be 
driven high by the bus master. 


NxAD<2>, AD<2> (address phase)  I/O  ADDRESS<2> (Dword Address) — For 
I/O cycles, this bit selects between the four-byte doublewords 
(dwords) in an eight-byte quadword (qword). For memory cycles, 
the bit is driven but the information is not normally used. 


NxAD<31:3> 

AD<31:3> 

address phase 


I/O 


ADDRESS<31:3> (Qword Address) — For memory cycles, 
these bits address an eight-byte quadword (qword) within the 
4GB memory address space. For I/O cycles, NxAD<15:3> 
specifies a qword within the 64kB I/O address space and 
NxAD<31:16> are driven low by the processor. In either 
case, the addressed data may be further restricted by the 
BE<7:0>* bits on NxAD<39:32>. Memory cycles (but not 
I/O cycles) may be expanded to additional consecutive 
qwords by the BLKSIZ<1:0>* bits on NxAD<51:50>. 


NxAD<39:32> 

AD<39:32> 

address phase 


I/O 


BE<7:0>* (Byte Enables) — Byte-enable bits for the data 
phase of the NxAD<63:0> bus. BE<0>* corresponds to the 
byte on NxAD<7:0>, and BE<7>* corresponds to the byte 
on NxAD<63:56>. The meaning of these bits is shown in 
Figures 11 and 12. 

For I/O cycles, BE<3:0>* specify the bytes to be transferred 
on NxAD<31:0> and BE<7:4>* are driven high by the 
processor. For memory cycles, all eight bits are used to 
specify the bytes to be transferred on NxAD<63:0>. 
Transfer Type              Meaning of BE<7:0>* 

I/O                        BE<3:0>* specify the bytes to transfer 
                           on NxAD<31:0>. BE<7:4>* are driven 
                           high by the processor. 

Figure 11 Byte-Enable Usage during I/O Transfers 



Transfer Type              Meaning of BE<7:0>* 

Memory 

Single Qword Read or       BE<7:0>* specify the bytes to transfer 
Write                      on NxAD<63:0>. 

Four-Qword Block Write     BE<7:0>* specify the bytes to transfer 
                           on NxAD<63:0> for first qword only. 
                           For all other qwords, BE<3:0>* are 
                           implicit zeros, and all bytes are 
                           transferred. 

Four-Qword Block Read      BE<7:0>* specify the bytes that are to 
(Cache-Block Fill)         be fetched immediately. 

Figure 12 Byte-Enable Usage during Memory Transfers 
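
The byte-enable encoding in Figures 11 and 12 can be illustrated with a short Python sketch. It assumes a single transfer that fits inside one eight-byte qword and relies only on the facts stated above: BE<0>* corresponds to NxAD<7:0>, BE<7>* to NxAD<63:56>, and the bits are active low. The function names are hypothetical and are not part of any NexGen tool.

```python
def byte_enables(offset: int, size: int) -> int:
    """Return the 8-bit BE<7:0>* pattern (active low) for a transfer of
    `size` bytes starting at byte `offset` within an eight-byte qword."""
    if not (0 <= offset < 8) or size < 1 or offset + size > 8:
        raise ValueError("transfer must fit inside one qword")
    enabled = ((1 << size) - 1) << offset  # 1 = byte takes part in the transfer
    return ~enabled & 0xFF                 # invert: 0 = byte enabled (active low)


def io_byte_enables(offset: int, size: int) -> int:
    """I/O cycles use only BE<3:0>*; BE<7:4>* stay high (negated)."""
    if offset + size > 4:
        raise ValueError("I/O transfer must fit inside one dword")
    return byte_enables(offset, size)


if __name__ == "__main__":
    print(f"{byte_enables(0, 8):08b}")     # full qword: all enables low
    print(f"{io_byte_enables(2, 2):08b}")  # upper word of the low dword
```

Because the enables are active low, a full-qword transfer yields an all-zero pattern, and an I/O transfer always leaves the upper four bits high.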



NxAD<45:40> 

AD<45:40> 

address phase 


I/O 


MID<5:0> (Master ID) — These bits indicate to a slave, and 
to the system-logic interface between the NexBus and other 
system buses (called the alternate-bus interface) during 
bus-crossing cycles, the identity of the NexBus master that 
initiated the cycle. The most-significant four bits are the 
device's SLOTID<3:0> bits. The least-significant two bits 
are the device's DEVICE<1:0> bits. MID 000000 is reserved 
for the systems logic. 
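
As a minimal sketch of the MID<5:0> layout described above, the following Python composes and splits a master ID from SLOTID<3:0> (upper four bits) and DEVICE<1:0> (lower two bits). The function names are hypothetical and chosen only for illustration.

```python
def make_mid(slotid: int, device: int) -> int:
    """Compose MID<5:0> from SLOTID<3:0> (bits 5:2) and DEVICE<1:0> (bits 1:0)."""
    if not 0 <= slotid <= 0xF:
        raise ValueError("SLOTID is a 4-bit value")
    if not 0 <= device <= 0x3:
        raise ValueError("DEVICE is a 2-bit value")
    return (slotid << 2) | device


def split_mid(mid: int) -> tuple:
    """Return (slotid, device) for a 6-bit master ID."""
    if not 0 <= mid <= 0x3F:
        raise ValueError("MID is a 6-bit value")
    return mid >> 2, mid & 0x3


def is_systems_logic(mid: int) -> bool:
    """MID 000000 is reserved for the systems logic."""
    return mid == 0


if __name__ == "__main__":
    print(f"{make_mid(0xF, 0):06b}")  # device 0 in slot 0Fh
```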
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NxAD<46> 
AD<46> 
address phase 


I/O 


W/R* (Write or Read*) — This bit distinguishes between 
read and write operations on the NexBus. Bus cycle types are 
interpreted as shown in Figure 13. 


NxAD<47> 
AD<47> 
address phase 


I/O 


D/C* (Data or Code*) — This bit distinguishes between data 
and code operations on the NexBus. Bus cycle types are 
interpreted as shown in Figure 13. 


NxAD<48> 
AD<48> 
address phase 


I/O 


M/IO* (Memory or I/O*) — This bit distinguishes between 
memory and I/O operations on the NexBus. Bus cycle types 
are interpreted as shown in Figure 13. 



NxAD<48>   NxAD<47>   NxAD<46> 
M/IO*      D/C*       W/R*       Type of Bus Cycle 

0          0          0          Interrupt Acknowledge 
0          0          1          Halt or Shutdown 
0          1          0          I/O Data Read 
0          1          1          I/O Data Write 
1          0          0          Memory Code Read 
1          0          1          (reserved) 
1          1          0          Memory Data Read 
1          1          1          Memory Data Write 

Figure 13 Bus-Cycle Types 
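
The encoding in Figure 13 amounts to a three-bit lookup on NxAD<48:46>. A minimal Python sketch of that lookup follows; the table contents come from Figure 13, while the names `BUS_CYCLE_TYPES` and `bus_cycle_type` are illustrative assumptions.

```python
# Index is the 3-bit value {M/IO*, D/C*, W/R*} taken from NxAD<48:46>.
BUS_CYCLE_TYPES = {
    0b000: "Interrupt Acknowledge",
    0b001: "Halt or Shutdown",
    0b010: "I/O Data Read",
    0b011: "I/O Data Write",
    0b100: "Memory Code Read",
    0b101: "(reserved)",
    0b110: "Memory Data Read",
    0b111: "Memory Data Write",
}


def bus_cycle_type(address_status_word: int) -> str:
    """Decode the bus-cycle type from a 64-bit address/status word."""
    code = (address_status_word >> 46) & 0b111  # bit 48 is the MSB, bit 46 the LSB
    return BUS_CYCLE_TYPES[code]


if __name__ == "__main__":
    word = (1 << 48) | (1 << 47)  # M/IO* = 1, D/C* = 1, W/R* = 0
    print(bus_cycle_type(word))
```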



NxAD<49> 
AD<49> 
address phase 


I/O 


Ownership Request — Asserted by a master when it intends 
to cache data in the exclusive state. This bit is asserted for 
write-backs and reads from the stack. If such an operation 
hits in the cache of another master, that master writes its data 
back (if copy is modified) and changes the state of its copy 
to invalid. If OWN* is negated during a read or write, 
another master may not assume that the copy is in shared 
state when not asserting SHARE* signal. 


NxAD<50> 
AD<50> 
address phase 


I/O 


Reserved — This bit must be driven high. 


NxAD<51> 
AD<51> 
address phase 


I/O 


BLKSIZ* (Block Size) — For memory operations, this bit 
defines the number of transfers. It is low for four-qword 
transfers and high for single byte, word, dword or qword 
cycles. For I/O operations, this bit is also driven high by the 
processor. 







For single transfers and block (burst) writes, the bytes to be 
transferred in the first qword are specified by the byte-enable 
bits, BE<7:0>* on NxAD<39:32>. If the slave is incapable 
of transferring more than a single qword, it or the system- 
logic interface between the NexBus and other system buses 
(called the alternate-bus interface) may deny a request for 
subsequent qwords by negating the GXACK or GBLKNBL 
inputs to the processor after a single-qword transfer, or after 
returning all bytes specified by BE<7:0>* in the first qword. 


NxAD<56:52> 
AD<56:52> 
address phase 


I/O 


Reserved — These bits must be driven high. 


NxAD<57> 
AD<57> 
address phase 


I/O 


SNPNBL (Snoop Enable) — Asserted to indicate that the 
current operation affects memory that may be present in 
other caches. When this signal is negated, snooping devices 
need not look up the addressed data in their cache tags. 


NxAD<58> 
AD<58> 
address phase 


I/O 


CACHBL (Cacheable) — Asserted by the bus master to 
indicate that it may cache a copy of the addressed data. The 
master typically decides what it will cache, based on 
software-configured address ranges. This bit supports higher- 
performance designs by letting the NexBus interface know 
what the master intends to do with the data, thereby allowing 
other devices to sometimes prevent unnecessary invalidation 
or write-backs. 


NxAD<63:59> 
AD<63:59> 
address phase 


I/O 


Reserved — These bits must be driven high by the bus 
master. 
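
Taken together, the per-bit descriptions above define a fixed layout for the 64-bit address/status word. The Python sketch below records that layout and extracts each field. It follows the per-bit entries (NxAD<50> reserved, BLKSIZ* on NxAD<51>); the field names in the dictionary are illustrative labels, not official signal names.

```python
# (msb, lsb) positions on NxAD<63:0> during the address and status phase.
FIELDS = {
    "RSVD_1_0":     (1, 0),
    "ADDRESS_2":    (2, 2),
    "ADDRESS_31_3": (31, 3),
    "BE_N":         (39, 32),   # byte enables, active low
    "MID":          (45, 40),
    "W_R":          (46, 46),
    "D_C":          (47, 47),
    "M_IO":         (48, 48),
    "OWN_N":        (49, 49),   # ownership request, active low
    "RSVD_50":      (50, 50),
    "BLKSIZ_N":     (51, 51),   # low for four-qword transfers
    "RSVD_56_52":   (56, 52),
    "SNPNBL":       (57, 57),
    "CACHBL":       (58, 58),
    "RSVD_63_59":   (63, 59),
}


def extract(word: int, msb: int, lsb: int) -> int:
    """Return bits msb..lsb of word as an unsigned integer."""
    width = msb - lsb + 1
    return (word >> lsb) & ((1 << width) - 1)


def decode_address_phase(word: int) -> dict:
    """Split a 64-bit address/status word into its named fields."""
    return {name: extract(word, msb, lsb) for name, (msb, lsb) in FIELDS.items()}


def reserved_bits_high(word: int) -> bool:
    """Check that every reserved field is driven high, as the table requires."""
    fields = decode_address_phase(word)
    return all(
        fields[name] == (1 << (msb - lsb + 1)) - 1
        for name, (msb, lsb) in FIELDS.items()
        if name.startswith("RSVD")
    )
```

The ADDRESS_31_3 field holds a qword index; shifting it left by three gives the corresponding byte address.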
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Nx586 L2 Cache Signals 



SRAMMODE 


I 


L2 Cache SRAM Mode Select — Selects the use of either 
synchronous or asynchronous SRAM for the L2 cache 
memory. This pin is sampled while reset is active. When 
SRAMMODE is left unconnected (floating), its internal 
pull-down resistor configures the Nx586 for asynchronous 
SRAMs. If SRAMMODE is pulled high, the Nx586 is configured 
for synchronous SRAMs. In synchronous mode, the CKMODE 
pin generates the SRAM clocks and COEB* generates global 
SRAM write enables after RESET. CKMODE can be 
disabled by driving SCLKE low (inactive). 


COEA* 


O 


L2 Cache Output Enable A — Enables reading from second- 
level cache SRAMs to drive the CDATA<63:0> bus. COEA* 
should be connected to a maximum of four devices. 


COEB*/WE* 


O 


L2 Cache Output Enable B — Enables reading from second- 
level cache SRAMs to drive the CDATA<63:0> bus. COEB* 
should be connected to a maximum of four devices. COEB* 
has the identical function as COEA* when SRAMMODE is 
low. 

Global Write Enable — When SRAMMODE is pulled 
high, COEB* is reconfigured as a global write enable for 
synchronous SRAMs. 


CWE<7:0>* 


O 


L2 Cache Write Enable — Enables writing to the second- 
level cache SRAMs. The CWE<0>* bit enables writing the 
byte on CDATA<7:0>. The CWE<7>* bit enables writing the 
byte on CDATA<63:56>. 


CBANK<1:0> 


O 


L2 Cache Bank — Selects one of four banks (sets) in the four- 
way set associative second-level cache. Each bank is either 
64kB or 256kB. These signals should be connected to the two 
least-significant address bits of the SRAMs. 


CADDR<17:3> 


O 


L2 Cache Address — The address of an eight-byte quantity in 
the second-level cache bank selected by CBANK<1:0>. Bits 
17:16 are not used for a 256kB L2 cache; they are only used 
for a 1MB cache. 


CDATA<63:0> 


I/O 


L2 Cache Data — Carries either one to eight bytes of second- 
level cache data, or the tags and state bits for one to four 
second-level cache banks (sets). Transfers on this bus occur at 
the peak rate of eight bytes every two processor clocks, but 
the transfers can begin on any processor clock. 
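
The relationship between the two supported L2 cache sizes and the CADDR/CBANK signals can be checked with a little arithmetic. The Python sketch below assumes only what the table states: four banks, eight-byte (qword) addressing starting at CADDR<3>, and 32-byte cache blocks. The function name and dictionary keys are hypothetical. Because tags are stored in part of the L2 SRAM, the block count shown is a raw figure, not the number of usable data blocks.

```python
QWORD_BYTES = 8    # CADDR addresses eight-byte quantities
BANKS = 4          # selected by CBANK<1:0>
BLOCK_BYTES = 32   # cache block (line) size


def l2_geometry(total_kb: int) -> dict:
    """Derive per-bank size and the highest CADDR bit used for an L2 size."""
    if total_kb not in (256, 1024):
        raise ValueError("the L2 cache is either 256kB or 1MB")
    bank_bytes = total_kb * 1024 // BANKS
    qwords_per_bank = bank_bytes // QWORD_BYTES
    caddr_bits = qwords_per_bank.bit_length() - 1  # log2 of a power of two
    return {
        "bank_kb": bank_bytes // 1024,
        "qwords_per_bank": qwords_per_bank,
        "caddr_msb": 3 + caddr_bits - 1,           # CADDR starts at bit 3
        "raw_blocks_per_bank": bank_bytes // BLOCK_BYTES,
    }


if __name__ == "__main__":
    for size in (256, 1024):
        print(size, l2_geometry(size))
```

For a 256kB cache the highest address bit needed is CADDR<15>, which agrees with the statement that bits 17:16 are used only for a 1MB cache.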
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Nx586 Clocks 



NxCLK 


I 


NexBus Clock — A TTL-level clock. All signals on 
NexBus/NexBus5 transition on the rising edge of NxCLK, 
except the asynchronous signals, INTR*, NMI*, GATEA20, 
and SLOTID<3:0>. If the Nx586 is configured for internal 
PLL mode, the processor's internal phase-locked loop (PLL) 
synchronizes the internal processor clocks at twice the 
frequency of NxCLK. 

For external PLL mode, NxCLK is used to generate the skew 
correcting reference clock for the external PLL circuitry. 
The skewed version of NxCLK is produced at XREF. 


PHE1 


I 


Clock Phase 1 — PHE1 is used as the processor clocking 
source when the Nx586 is configured for external PLL mode. 
The deskewed clock (normally twice the frequency of 
XREF) generated by the external PLL circuitry is connected 
to PHE1. For normal clocking operation, this signal should 
be pulled low. 


PHE2 


I 


Clock Phase 2 — PHE2 determines the relationship between 
the internal non-overlapped clocks. When pulled low, 
narrow non-overlapped clocks are generated. Wide non- 
overlapped clocks are produced for PHE2 pulled high. For 
normal clocking operation, this signal should be pulled low. 


SCLKE 


I 


Synchronous Clock Enable — While in synchronous SRAM 
mode (see SRAMMODE), SCLKE is used to determine the 
output of CKMODE. If SCLKE is asserted, CKMODE 
generates a clock equal to the processor's internal frequency 
(twice NxCLK). While inactive, CKMODE is driven low. 
For normal clocking operation, this signal should be pulled 
low. 


CKMODE 


I 


Clock Mode — For normal clocking operation, this signal 
should be pulled low. When SRAMMODE is pulled high, 
the Nx586 is configured for synchronous SRAMs. In 
synchronous SRAM mode, the CKMODE pin generates the 
SRAM clocks and COEB* generates global L2 SRAM write 
enables after RESET is inactive. 
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XSEL 


I 


Clock Mode Select — XSEL selects which PLL mode the 
processor uses, internal or external. Internal PLL mode is 
selected when XSEL is tied low. When XSEL is pulled high, 
external PLL mode is selected. For normal clocking 
operation, this signal should be tied low. 


XPH1 


O 


Processor Clock Phase 1 — For normal clocking operation, 
this signal must be left unconnected. 


XPH2 


O 


Processor Clock Phase 2 — For normal clocking operation, 
this signal must be left unconnected. 


IREF 


I 


Clock Current Reference — This signal must be pulled up to 
VDDA. Refer to NexGen for the optimal value. 


XREF 


O 


Clock Output Reference — For normal clocking operation, 
this signal must be terminated with a value that matches the 
characteristic impedance of the circuit board (PCB). A 
Thevenin type of termination is recommended. 

In external PLL mode, XREF is the skewed version of 
NxCLK and is normally connected to the input of the 
external PLL circuitry's phase comparator. 


VDDA 


I 


PLL Analog Power — This input provides power for the 
on-chip PLL circuitry and should be isolated from VCC by a 
ferrite bead and decoupled with a 0.1 µF ceramic capacitor. 
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Nx586 Interrupts and Reset 



NPIRQ* 


O 


Floating Point Unit Interrupt Request — Asserted by the 
Floating-Point unit to the interrupt controller's IRQ 13 input, 
which services floating-point errors in a PC-AT. 


INTR* 


I 


Maskable Interrupt — Level sensitive. This signal is 
asserted by an interrupt controller. The processor responds 
by stopping its current flow of instructions at the next 
instruction boundary, aborting earlier instructions that have 
been partially executed, and performing an interrupt 
acknowledge sequence, as described in the Bus Operations 
chapter. This signal is asynchronous to NxCLK. 


NMI* 


I 


Non-Maskable Interrupt — Edge sensitive. Asserted by 
systems logic. The effect of this signal is similar to INTR*, 
except that NMI* cannot be masked by software, the 
interrupt acknowledge sequence is not performed, and the 
handler is always located by interrupt vector 2 in the 
interrupt descriptor table. This signal is asynchronous to the 
processor and to NxCLK. 


RESET* 


I 


Global Reset (Power-Up Reset) — Asserted by systems 
logic. The processor responds by resetting its internal state 
machines, loading default values into its registers, and 
reading the hardware configuration pins (such as SRAMMODE, 
CKMODE, and XSEL). At power-up, it must remain asserted 
for a minimum of 1 millisecond after VCC and NxCLK have 
reached their proper AC and DC specifications. 


RESETCPU* 


I 


Reset CPU (Soft Reset) — Asserted by the systems logic to 
reset the processor without changing the state of memory or 
the processor's caches. This signal is normally routed only to 
the primary processor in SLOTID 0Fh. 


GATEA20 


I 


Gate Address 20 — When asserted by the system controller 
or keyboard controller, the processor drives bit 20 of the 
physical address at its current value. When negated, address 
bit 20 is cleared to zero, causing the address to wrap around 
into a 20-bit address space. 

This method replicates the PC-AT processor's handling of 
address wraparound. All physical addresses are affected by 
the ANDing of GATEA20 with address bit 20, including 
cached addresses. GATEA20 is asynchronous to the 
processor's internal clock and to NxCLK. 
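
The ANDing of GATEA20 with address bit 20 can be shown in a few lines of Python. This is only a behavioral sketch of the masking described above; the function name is hypothetical.

```python
A20_MASK = 1 << 20  # physical address bit 20


def apply_gatea20(physical_address: int, gatea20: bool) -> int:
    """Return the address the processor drives for a given GATEA20 level.

    With GATEA20 asserted the address passes through unchanged; with it
    negated, bit 20 is forced to zero so the address wraps around.
    """
    if gatea20:
        return physical_address
    return physical_address & ~A20_MASK


if __name__ == "__main__":
    addr = 0x10FFEF  # just above the 1MB boundary
    print(hex(apply_gatea20(addr, True)))
    print(hex(apply_gatea20(addr, False)))
```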
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Nx586 Test and Reserved Signals 



ANALYZEIN 


I 


Reserved — This signal must be pulled low for normal 
operation. 


ANALYZEOUT 


O 


Reserved — This signal must be left unconnected for normal 
operation. 


NC 


- 


Reserved — These signals must be left unconnected. 


GREF 


O 


Ground Reference — This signal must be left unconnected 
for normal operation. 


HROM 


I 


Reserved — This signal must be pulled low. 


P4REF 


O 


Power Reference — This signal must be left unconnected for 
normal operation. 


POPHOLD 


I 


Reserved — This signal must be pulled low for normal 
operation. 


PTEST 


I 


Processor TEST — This pin tri-states all outputs except for 
the following pins: XPH1, XPH2, and XREF. For normal 
operation, this input must be pulled low. 


PULLHIGH 


I/O 


Reserved — These signals must be individually pulled high to 
VCC4 for normal operation. 


PULLLOW 


I/O 


Reserved — These signals must be individually pulled low 
for normal operation. 


SERIALIN 


O 


Serial In — The input of the scan-test chain. This signal must 
be left unconnected for normal operation. 


SERIALOUT 


O 


Serial Out — The output of the scan-test chain. This signal 
must be left unconnected for normal operation. 


TESTPWR* 


I 


Test Power — Powers down the CPU's static circuits during scan 
tests. This signal must be pulled high for normal operation. 


TPH1 


I 


Test Phase 1 Clock — For scan test support. This signal must 
be pulled low for normal operation. 


TPH2 


I 


Test Phase 2 Clock — For scan test support. This signal must 
be pulled low for normal operation. 






Nx586 Alphabetical Signal Summary 



Signal         Type  Description 

ALE*           O     Address Latch Enable 
ANALYZEIN      I     Analyze In 
ANALYZEOUT     O     Analyze Out 
AREQ*          O     Alternate-Bus Request 
CADDR<17:3>    O     L2 Cache Address 
CBANK<1:0>     O     L2 Cache Bank 
CDATA<63:0>    I/O   L2 Cache Data 
CKMODE         I     Clock Mode or L2 Synchronous Clock output 
COEA*          O     L2 Cache Output Enable A 
COEB*(WE*)     O     L2 Cache Output Enable B or Synchronous SRAM global write 
CWE<7:0>*      O     L2 Cache Write Enable 
DCL*           O     Dirty Cache Line 
GALE           I     Group Address Latch Enable 
GATEA20        I     Gate Address 20 
GBLKNBL        I     Group Block (Burst) Enable 
GDCL           I     Group Dirty Cache Line 
GNT*           I     Grant NexBus5 
GREF           I     Ground Reference 
GSHARE         I     Group Shared Data 
GTAL           I     Group Try Again Later 
GXACK          I     Group Transfer Acknowledge 
GXHLD          I     Group Transfer Hold 
HROM           I     Reserved 
INTR*          I     Maskable Interrupt 
IREF           I     Clock Input Reference 
LOCK*          O     Bus Lock 
NC             -     Reserved 
NMI*           I     Non-Maskable Interrupt 
NPIRQ*         O     Reserved 
NREQ*          O     NexBus5 Request 
NxAD<63:0>     I/O   Bus Address/Status, or Bus Data 
NxADINUSE      O     Reserved 
NxCLK          I     NexBus5 Clock 
OWNABL         I     Ownable 
P4REF          O     Power Reference 
PHE1           I     Clock Phase 1 
PHE2           I     Clock Phase 2 
POPHOLD        I     Reserved 
PTEST          I     Reserved 
PULLHIGH       I/O   Reserved 
PULLLOW        I     Reserved 
RESET*         I     Global Reset (Power-Up Reset) 
RESETCPU*      I     Reset CPU (Soft Reset) 
SCLKE          I     Synchronous SRAM Clock Enable (CKMODE) 
SERIALIN       O     Serial In 
SERIALOUT      O     Serial Out 
SHARE*         O     Shared Data 
SLOTID<3:0>    I     NexBus Slot ID 
SRAMMODE       I     L2 Cache SRAM Type Select 
TESTPWR*       I     Test Power 
TPH1           I     Test Phase 1 Clock 
TPH2           I     Test Phase 2 Clock 
VDDA           I     PLL Analog Power 
XACK*          O     Transfer Acknowledge 
XBCKE*         O     NexBus-Transceiver Clock Enable 
XBOE*          O     NexBus-Transceiver Output Enable 
XHLD*          O     Transfer Hold 
XCVERE*        I     Internal NexBus Transceiver Enable 
XNOE*          O     NexBus-Transceiver Output Enable 
XPH1           O     Processor Clock Phase 1 
XPH2           O     Processor Clock Phase 2 
XREF           O     Clock Output Reference 
XSEL           I     Clock Mode Select 
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Hardware Architecture 



The Nx586 processor and the optional integrated floating-point execution unit are tightly coupled 
into a parallel architecture with a distributed pipeline, distributed control, and a rich hierarchy of 
storage elements. While the features of the two devices are sometimes listed separately elsewhere in 
this book, they are treated as an integrated architecture in this chapter. The Nx586 and the Nx586 
with the floating-point unit have an identical system bus architecture, so the two devices are 
interchangeable within the processor socket. 



Bus Structure 

The Nx586 processor supports two external 64-bit buses (the processor bus and the L2 cache bus) and 
one internal 64-bit bus (for the integrated floating-point unit). All buses are synchronous to the 
NxCLK clock. The internal floating-point unit bus operates at the same speed as the processor core, 
that is, at twice the frequency of the local bus. 




Figure 14 Nx586 Bus Structure Diagram 
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Processor Bus 

The Nx586 supports two local bus interfaces, NexBus and NexBus5. NexBus is considered a true 
CPU local bus, whereas NexBus5 is a NexGen proprietary system bus. While RESET* is active, 
the XCVERE* pin is sampled to select the local bus mode, which determines the type of bus 
generated by the processor. When XCVERE* is pulled high, the Nx586 generates the NexBus standard, 
which requires external transceivers to connect the processor to the NexBus5 system bus. Figure 15 is a 
system block diagram showing the Nx586 configured with a NexBus interface (XCVERE* = 1). The 
NexBus transceivers are high-speed, non-inverting registered transceivers controlled by signals 
provided by the Nx586. 




[Diagram labels: NexBus5 Address and Data, NxAD<63:0>; Standard Buses (VL, PCI, ISA, EISA, MCA, etc.)] 

Figure 15 Nx586 Basic System Diagram 

Another approach for processors configured with external NexBus transceivers is to include the 
transceiver within the systems logic. This, however, forces the systems logic to provide complete 
arbitration and signal routing to the processor via NexBus. As shown in Figure 16, the example 
PC-AT compatible systems controller contains the system arbiter, the memory controller, the VL-Bus 
controller, and the ISA-Bus controller. The example systems logic is completely responsible for bus 
interconnections between NexBus, the memory bus, VL-Bus and ISA bus. The Integrated Peripheral 
Controller contains the interrupt controller, DMA controller, CMOS memory, timers and counters. 

When XCVERE* is tied low, the Nx586 generates NexBus5 directly. NexBus5 is a 64-bit 
synchronous, multiplexed bus that supports all signals and bus protocols needed for cache coherency. 
A modified write-once MESI protocol is used for cache coherency. The processor continually 
monitors NexBus5 to guarantee cache coherency. 




Figure 17 Example System with the Nx586 and NexBus5 



The Nx586-based PCI system shown in Figure 17 uses a chipset to connect the Nx586 via the 
NexBus5 system bus. The example chipset is divided into two major components, the memory 
controller and the system controller. The system controller contains the NexBus5 arbiter and the PCI 
bus controller. Note that both the memory controller and the system controller are NexBus5 devices 
and can respond directly to the processor. The ISA bus is generated by a PCI-to-ISA bridge chip. 
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L2 Cache Bus 

The 64-bit L2 cache bus is dedicated to external SRAM cache. The bus carries one to eight bytes of 
cache data, or the tags and state bits for one to four cache banks (sets). The L2 cache write policy 
can be programmed for write-back or write-through. At power-on reset, the L2 cache controller can 
be configured for synchronous or asynchronous SRAMs. Bus accesses for each mode are identical 
except for the SRAM clocking signal on CKMODE for synchronous SRAMs. Note that synchronous 
SRAMs operate at the same frequency as the processor, not at half that frequency. 

The processor manages cache coherency for both the L2 and L1 caches. The 64-bit L2 cache bus is 
fully isolated and decoupled from the processor local bus (NexBus). The L2 cache bus 
does not require any arbitration to gain control of the bus; the L2 cache controller can start an 
L2 cache cycle on any processor clock. In addition, speculative cycles are supported on the L2 
cache bus: the processor can request data from the L2 cache controller and terminate the cycle at 
any time during the access. The unit of transfer between memory and the cache is 32 bytes. 
There is no data bursting from the L2 cache memory. Since no arbitration is necessary, L1 cache 
line fills are simply back-to-back read cycles. 



Internal 64-bit Execution Unit Bus 

The Nx586 contains an internal 64-bit bus dedicated to the optional floating-point execution unit. 
Discrete arbitration signals implement a simple protocol between the two devices. Arbitration 
priority is given to the processor, so reads prevail over writes. The winner gets the bus on the next 
clock. The arbitration and data transfers are pipelined one clock apart at the processor-clock 
frequency. Thus, in every processor clock, both a bus request and a data transfer can be performed, 
making the Floating-Point execution unit a tightly coupled component of the execution pipeline. 

Both the processor core and the Floating-Point execution unit sometimes make speculative requests 
for the local bus (NexBus). For example, the processor requests the bus while it concurrently looks 
in its cache for the data to be transferred. The Floating-Point execution unit makes speculative 
requests concurrently with its first pass at formatting the output, which may in fact need further 
formatting before transfer. If either device finds that it cannot use the bus after requesting it, it 
negates its request signal thereby allowing access to the bus by the other device. 
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Operating Frequencies 

There are four operating frequencies associated with the processor, as shown in Figure 18: 

■ NexBus/NexBus5 — Operates at the frequency of the system clock (NxCLK). 

■ Processor — Operates at twice the frequency of NxCLK. The Nx586 processor and the 
Floating Point Execution Unit both operate at the same frequency. 

■ L1 (On-Chip) Cache — Operates at twice the frequency of the processor clock. 

■ L2 (Off-Chip) Cache — Operates at the same frequency as NxCLK. Transfers between 
L2 cache and the processor occur at the peak rate of one octet (eight bytes) every two 
processor clocks, but the transfers (which can be back-to-back) can begin on any processor 
clock. Data is returned to the processor on the third clock phase after an access is started. 

Unless otherwise specified in this book, a clock cycle means the Nx586 processor's clock cycle. 
However, most of the timing diagrams in the Bus Operations chapter are relative to the NxCLK 
clock, not the processor clock. 

Figure 18 shows the relative clocking frequencies for an Nx586 processor. The NxCLK clock 
determines the system's overall operating speed and sets the NexBus/NexBus5 
operating frequency. The processor's on-board PLL doubles the frequency of NxCLK, making the 
Nx586 operate at twice the frequency of NexBus. The dual-port nature of the L1 caches requires the 
L1 cache controllers to operate at double the frequency of the processor. The effective operating 
frequency of the L2 cache is half that of the processor. 
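
The clock ratios above reduce to simple multiples of NxCLK. The Python sketch below derives them, along with the peak L2 transfer rate implied by eight bytes every two processor clocks. The 40 MHz input in the usage line is purely an illustrative value and is not a frequency taken from this databook; the function names are hypothetical.

```python
def clock_tree(nxclk_mhz: float) -> dict:
    """Derive the four operating frequencies (in MHz) from NxCLK."""
    processor = 2 * nxclk_mhz              # internal PLL doubles NxCLK
    return {
        "nexbus_mhz": nxclk_mhz,           # bus runs at NxCLK
        "processor_mhz": processor,
        "l1_cache_mhz": 2 * processor,     # L1 runs at twice the processor clock
        "l2_cache_mhz": processor / 2,     # effective L2 rate is half the processor
    }


def l2_peak_mbytes_per_s(nxclk_mhz: float) -> float:
    """Peak L2 rate: eight bytes every two processor clocks (1 MB = 10**6 bytes)."""
    processor = 2 * nxclk_mhz
    return 8 * processor / 2


if __name__ == "__main__":
    print(clock_tree(40.0))            # hypothetical 40 MHz NxCLK
    print(l2_peak_mbytes_per_s(40.0))
```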



NexBus Clock (NxCLK)   1           2 
Nx586 Processor        1     2     3     4 
L1 Cache               1a 1b 2a 2b 3a 3b 4a 4b 
L2 Cache               1           2 

Figure 18 System Clocking Relationships 

The processor uses an on-chip phase-locked loop and NxCLK to internally generate a two-phase 
non-overlapping clock, shown in Figure 18 as the phases that drive the L1 cache. Most of the 
processor's pipeline stages operate on these phases. For example, a register-file access, an adder 
cycle, a lookup in the translation lookaside buffer (TLB), and an on-chip cache read or write all take 
a single phase of the processor clock. 
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Figure 19 shows the relationship between functional units within the Nx586 processor. The main 
processing pipeline is distributed across five units: 

■ Decode Unit 

■ Address Unit 

■ Cache and Memory Unit 

■ 2 Integer Units 

■ Floating Point Execution Unit (optional) 

All functional units work in parallel with a high degree of autonomy, concurrently processing 
different parts of several instructions. Only the Cache and Memory Unit has an interface (NexBus or 
NexBus5) that is visible outside the processor. 
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Figure 19 Nx586 Internal Architecture 
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Storage Hierarchy 

The Nx586 architecture provides a rich hierarchy of storage mechanisms designed to maximize the 
speed at which functional units can access data with minimum bus traffic. Control for a modified 
write-once cache-coherency protocol (MESI) is built into this hierarchy. 

In addition to the L1 and L2 caches, the processor also has three other storage structures that 
contribute to the speed of accessing information: (1) a prefetch queue in the Decode Unit, (2) branch 
prediction capability in the Decode Unit, and (3) a write queue in the Cache and Memory Unit. The 
storage hierarchy can continue at the system level with other buffers and caches. For example, a 
system using a memory controller chip that maintains a prefetch queue between the L2 cache and 
main memory can continuously pre-load cache blocks in anticipation of the processor's next request 
for a cache fill. Bus masters on buses interfaced to the NexBus can also maintain caches, but those 
other masters must use write-through caches. 

Figure 20 shows this hierarchy during a read cycle in a system. Figure 21 shows the analogous 
organization during a write cycle. All levels of cache and memory are interfaced through 64-bit 
buses. Physically, transfers between L2 cache and main memory go through the processor via 
NexBus, and transfers between L1 and L2 cache go through the processor via the dedicated L2-cache 
bus. While NexBus5 is multiplexed between address/status and data, the L2-cache data bus 
carries only data, at 64 bits every NexBus5 clock cycle. The disk subsystem and software disk cache 
are included in the figures for completeness of the hierarchy; the software disk cache is maintained 
in memory by some operating systems. 
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Figure 20 Storage Hierarchy (Reads) 
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Figure 21 Storage Hierarchy (Writes) 
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Transaction Ordering 

Interlocks enforce transaction ordering in a manner that optimizes read accesses. Interlocks are 
conditions where the execution of one function is deferred until the conflicting function has 
completed execution. With the exceptions detailed below, the general rules for transaction ordering 
are: 

■ Memory Reads — Memory reads (whether cache hits or reads on NexBus) are re-ordered 
ahead of writes, are performed out of order with respect to other reads, and are done 
speculatively. With respect to the most recent copy of data, the write queue takes priority 
over the cache. A hit in the write queue is serviced directly from that queue. 

■ I/O and Memory-Mapped I/O Reads — I/O reads are not done speculatively because they can 
have side effects in memory that may cause the I/O read to be done improperly. I/O reads 
have higher priority than memory reads, but all pending writes are completed first. 

■ All Writes — Writes are performed in order with respect to other writes, and they are never 
performed speculatively. Writes are always held in the write queue until the processor 
knows the outcome of all older instructions. 

■ Locked Cycles — Locked read-modify-writes are stalled until the write queue is emptied. 

■ Cache-Hit Reads — The processor holds reads that hit in the cache if any of the following 
conditions exist: 

— The cache entry depends upon pending writes that have not yet 
received their data, are mapped as non-cacheable or are mapped as 
write-protected. 

— The read is locked (hence, the rules below for Memory Reads on 
NexBus are followed). 

■ Memory Reads on NexBus — The processor holds memory reads on NexBus (cache misses) 
if any of the following conditions exist: 

— Reads are I/O or Memory-Mapped I/O. 

— The write queue has pending writes to I/O or to memory that are 
mapped as non-cacheable I/O. 

— The read is locked, and the write portion of a previous locked read- 
modify-write has not yet been performed. 
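
The rule that the write queue takes priority over the cache for the most recent copy of data can be modeled in a few lines. The Python sketch below is a deliberately simplified software model, not the processor's actual logic: it assumes whole-location matches, scans the queue newest-first (an assumption made here so that the latest pending write wins), and ignores the hold conditions listed above. All names are hypothetical.

```python
def service_read(address, write_queue, cache):
    """Return (source, data) for a read.

    write_queue: list of (address, data) pairs, oldest first.
    cache:       dict mapping address -> data.
    """
    # A pending write holds the most recent copy, so check the queue first,
    # newest entry first.
    for queued_address, data in reversed(write_queue):
        if queued_address == address:
            return "write_queue", data
    if address in cache:
        return "cache", cache[address]
    # Miss in both: the read would have to go out on NexBus.
    return "nexbus", None


if __name__ == "__main__":
    queue = [(0x100, "old"), (0x100, "new")]
    cache = {0x100: "stale", 0x200: "cached"}
    print(service_read(0x100, queue, cache))
    print(service_read(0x200, queue, cache))
    print(service_read(0x300, queue, cache))
```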
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Characteristics 

The cache and memory subsystem is a key element in the processor's performance. Each of the two 
on-chip L1 caches (instruction and data) is 16kB in size and dual-ported. The L2 cache is either 
256kB or 1MB and single-ported. It can be built from an array of eight asynchronous SRAMs (using 
8-bit devices), four synchronous SRAMs (using 16/18-bit devices), or two synchronous SRAMs (using 
32/36-bit devices). The L2 cache stores instructions and data in 32-byte cache blocks (lines), each 
of which has an associated tag and cache-coherency state. Separate external tag RAMs are not used. 
Instead, tag data is stored in a small part of the L2 cache. L2 is a random-access cache, with the 
L2 cache controller coupled very closely to the processor. Memory references of any kind can be 
interleaved without compromising performance: the L2 cache responds to random accesses just as 
quickly as to block transfers. The unit of transfer between memory and cache is 32 bytes. 
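The device counts and block counts implied by these figures can be checked with a little arithmetic. This is a minimal sketch, assuming the SRAM array spans the 64-bit L2 data bus (CDATA<63:0>) and counting only the 8, 16, or 32 data bits of each device; the extra bits of 18-bit and 36-bit parts are ignored here. The function names are hypothetical.

```python
BLOCK_BYTES = 32        # cache block (line) size, from the text
L2_DATA_BUS_BITS = 64   # assumed width of the L2 data bus (CDATA<63:0>)


def sram_device_count(device_data_bits: int) -> int:
    """Number of SRAM devices needed to span the 64-bit L2 data bus."""
    if L2_DATA_BUS_BITS % device_data_bits:
        raise ValueError("device width must divide the bus width")
    return L2_DATA_BUS_BITS // device_data_bits


def block_count_upper_bound(cache_bytes: int) -> int:
    """Upper bound on 32-byte blocks; tag storage uses part of the array."""
    return cache_bytes // BLOCK_BYTES


for bits in (8, 16, 32):
    print(bits, "data bits per device:", sram_device_count(bits), "devices")
for size in (256 * 1024, 1024 * 1024):
    print(size // 1024, "kB:", block_count_upper_bound(size), "blocks at most")
```

The results agree with the array sizes quoted above: eight, four, and two devices. Because tag data occupies a small part of the L2 array, the number of usable data blocks is somewhat lower than the upper bound.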
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Figure 22 Cache Characteristics 

If a write needs to go to NexBus for cache-coherency purposes, it does so before it goes to a cache. 
Whether the write is needed on NexBus depends on the caching state of the data: if the data is shared 
(as described later in the Cache Coherency section), all other NexBus5 caching devices need to know 
about the imminent write so that they can take appropriate action. The processor's caches can be 
configured so that specified locations in the memory space can be cacheable or non-cacheable and 
read/write or read only (write-protected). 

The Cache and Memory Unit contains a write queue that stores partially and fully assembled writes. 
The queue serves several functions. First, it buffers writes that are waiting for bus access, and it 
reorders writes with respect to reads or other more important actions. Second, it assembles the pieces 
of a write as they become available. (Addresses and data arrive at the queue separately as they come 
out of the distributed pipelines of other functional units.) Third, the queue is used to back out of 
instructions when necessary. All writes remain in the queue until the Decode Unit signals that the 
instruction associated with the write is retired, i.e., that there is no possibility of an instruction 
backout due to a branch not taken or to an exception or interrupt during execution. 

Reads are looked up in the write queue simultaneously with the L1 cache lookup. A hit in the write 
queue is serviced directly from that queue, and write locations pending in the queue take priority 
over any L1-cache copy of the same location. Reads coming into the unit from NexBus are routed in 
a pipeline to the processor L2 cache and L1 caches. Reads coming in from the L2 cache are routed 
first to the processor, then to the L1 caches. Write-backs go only to NexBus. Pending writes in the 
queue go first to the L1 caches (both the instruction and data caches can be written), then to L2 if 
necessary, then to NexBus if necessary. 
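The read-forwarding and backout behavior of the write queue can be sketched as follows. This is a simplified model, not the hardware design: the class and method names are hypothetical, writes are treated as fully assembled, forwarding from the youngest matching entry is an assumption, and draining of retired writes to the caches is not modeled.

```python
class WriteQueue:
    """Toy model of a write queue with read forwarding and backout."""

    def __init__(self):
        # Oldest first; each entry is [address, data, retired].
        self._entries = []

    def push(self, addr, data):
        self._entries.append([addr, data, False])

    def retire_oldest(self):
        """Mark the oldest unretired write as belonging to a retired instruction."""
        for entry in self._entries:
            if not entry[2]:
                entry[2] = True
                return

    def back_out(self):
        """Discard every write whose instruction has not retired."""
        self._entries = [e for e in self._entries if e[2]]

    def lookup(self, addr):
        """Return the data of the youngest pending write to addr, else None."""
        for a, data, _retired in reversed(self._entries):
            if a == addr:
                return data
        return None


def read(addr, queue, l1):
    """A queue hit takes priority over any L1 copy of the same location."""
    hit = queue.lookup(addr)
    return hit if hit is not None else l1.get(addr)
```

For example, with `l1 = {0x100: 1}`, a pending `push(0x100, 2)` makes `read(0x100, queue, l1)` return the queued value rather than the L1 copy.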

The dual ports on the L1 instruction and data caches protect the processor from stalls. In a single 
clock, the processor can read from port A on each cache while it reads or writes port B on each 
cache, such as for cache lookups, cache fills, and other cache housekeeping overhead. Both L1 
caches may contain identical data, as when a 32-byte cache block contains both instructions and data 
and is loaded into both L1 caches in different cache-block reads. 



Level-2 Cache Power-on RESET Configurations 



The Nx586 supports two types of Level-2 cache SRAMs, asynchronous and synchronous. 
Asynchronous SRAMs suit low-speed, cost-effective system designs. Synchronous SRAMs, on the 
other hand, are needed for systems operating the Nx586 at very high speeds (typically, 100MHz or 
higher). A group of pins is examined at power-on RESET to determine the mode in which the L2 
cache controller operates. The key pin is SRAMMODE. If SRAMMODE is pulled high, the on-chip 
L2 cache controller is configured for synchronous SRAMs. In synchronous SRAM mode, the 
CKMODE pin generates the SRAM clocks and COEB* generates global L2 SRAM write enables 
after RESET is inactive. While in synchronous SRAM mode, SCLKE determines the output of 
CKMODE. If SCLKE is asserted, CKMODE generates a clock equal to the processor's internal 
frequency (double NxCLK). While SCLKE is inactive, CKMODE is driven low. Due to loading and 
the loss of COEB* in synchronous mode, the synchronous SRAMs must be wide-I/O (32/36-bit) 
devices. The basic access style for the synchronous SRAMs is "Flow Through". When SRAMMODE 
is left unconnected (floating), the internal pull-down resistor configures the Nx586 for asynchronous 
SRAMs. Figure 23 shows how to configure the Nx586 for the particular L2 cache SRAM mode 
desired. 
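The power-on selection described above can be expressed as a small decision function. This is a sketch under stated assumptions: the function name and return format are hypothetical, pin levels are passed as booleans, and the behavior of CKMODE in asynchronous mode is not covered by this paragraph, so it is reported as `None`.

```python
def l2_sram_config(srammode_high: bool, sclke_asserted: bool, nxclk_mhz: float) -> dict:
    """Summarize the L2 SRAM mode selected at power-on RESET.

    srammode_high is False when SRAMMODE is low or left floating
    (the internal pull-down then selects asynchronous SRAMs).
    """
    if not srammode_high:
        # Asynchronous mode: CKMODE output behavior is not described here.
        return {"mode": "asynchronous", "ckmode_clock_mhz": None}
    if sclke_asserted:
        # CKMODE runs at the processor's internal frequency (double NxCLK).
        return {"mode": "synchronous", "ckmode_clock_mhz": 2 * nxclk_mhz}
    # SCLKE inactive: CKMODE is driven low (no clock).
    return {"mode": "synchronous", "ckmode_clock_mhz": 0.0}
```

The NxCLK value passed in is whatever the system uses; no particular frequency is implied by the sketch.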
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Figure 23 L2 Cache SRAM Interface Modes 
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Cache Coherency 

The processor monitors (snoops) NexBus5 operations by bus masters to guarantee coherency with 
data cached in the processor's L2 cache, L1 caches, and branch prediction logic. The processor uses 
a write-invalidate cache-coherency protocol called modified write-once (MWO), or modified, 
exclusive, shared, invalid (MESI). In this protocol, each 32-byte block in the L2 cache is in one of 
four states: 

■ Exclusive — Data copied into a single bus-master's cache. The master then has the exclusive 
right (not yet exercised) to modify the cached data. Also called owned clean data. 

■ Modified — Data copied into a single bus-master's cache (originally in the exclusive or 
invalid state) but that has subsequently been written to. Also called dirty or stale data. 

■ Shared — Data that may be copied into multiple bus-masters' caches and can therefore only 
be read, not written. 

■ Invalid — Cache locations in which the data is not correctly associated with the tag for that 
cache block. Also called absent or not present data. 

The protocol allows any NexBus5 caching device to gain exclusive ownership of cache blocks, and 
to modify them, without writing the updated values back to main memory. It also allows caching 
devices to share read-only versions of data. To implement the protocol, the processor: 

■ Requests data in a specific state by asserting or negating NexBus5 cache-control bits. 

■ Caches data in a specific state by watching NexBus5 cache-control input signals from 
system logic and the slave being accessed. 

■ Snoops NexBus5 to detect other NexBus5 transactions that hit in the processor's caches. 

■ Intervenes in the operations of other NexBus5 devices to write back modified data to main 
memory if a hit occurs during a bus snoop. 

■ Updates the state of cached blocks if a hit occurs during a bus snoop. 

The protocol name, write-once, reflects the processor's ability to obtain exclusive ownership of 
certain types of data by writing once to memory. If the processor caches data in the shared state and 
subsequently writes to that location, a write-through to memory occurs. During the write-through, all 
other caching devices with shared copies invalidate their copies (hence the name, write-invalidate). 
After the write, the processor owns the data in the exclusive state, since the processor has the only 
valid copy and it matches the copy in memory. Any additional writes are local — they change the 
state of the cached data to modified, although the changes are not written back to memory until an 
update or cache replacement snoop cycle by another bus master forces the write-back. Write-once 
protocols maximize the processor's opportunities to cache data in the exclusive (owned) state even 
when the processor has not specifically requested exclusive use of data, thereby maximizing the 
number of transactions that can be performed from the cache. 
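The write-once sequence in the paragraph above (shared, then exclusive after one write-through, then modified on further local writes) can be traced with a short sketch. The names are hypothetical, and the condition that OWNABL and GXACK must both be asserted for the promotion to exclusive is taken from the Shared State section.

```python
from enum import Enum


class State(Enum):
    INVALID = "invalid"
    SHARED = "shared"
    EXCLUSIVE = "exclusive"
    MODIFIED = "modified"


def local_write(state: State, ownabl_and_gxack: bool = True):
    """Return (next_state, write_through) for a processor write that hits."""
    if state is State.SHARED:
        # First write goes through to memory; other caches invalidate.
        nxt = State.EXCLUSIVE if ownabl_and_gxack else State.SHARED
        return nxt, True
    if state in (State.EXCLUSIVE, State.MODIFIED):
        # Later writes are local and leave the block modified.
        return State.MODIFIED, False
    raise ValueError("write miss: the block must be allocated first")


state = State.SHARED
for _ in range(3):
    state, write_through = local_write(state)
    print(state.value, "write-through" if write_through else "local")
```

Only the first of the three writes reaches memory; the remaining two are absorbed by the cache.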

There are also other means of obtaining ownership of data besides writing to memory, and write 
operations can be performed in a way that does not modify ownership. The protocol is compatible 
with caching devices that employ write-through caching policies, if the devices implement bus 
snooping and support cache-block invalidation. Caching devices that use a cache-block (line) size 
other than four qwords must use a write-through policy. 
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State Transitions 

Transitions among the four states are determined by prior states, the type of access, the state of 
cache-control signals and status bits, and the contents of configuration registers associated with the 
cache. Figure 24 shows only the basic state transitions for write-back addresses. Transitions occur 
when the processor reads or writes data (hits and misses), or when it encounters a snoop hit. No 
transitions are made for snoop misses. In the default processor configuration and depending on the 
cause of an operation, reads can be either for exclusive ownership or shared use, but write misses are 
allocating (fetch on write) — they initiate a read for exclusive ownership, followed by a cache write. 
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Figure 25 describes the primary signals and status bits that affect the state transitions shown in Figure 
24. The OWN* and SHARE* signals control many transitions. The assertion of OWN* implies that 
the data is both snoopable (SNPNBL) and cacheable (CACHBL). Figure 26 describes the signals 
and status bits that affect processor responses during bus snooping. The four sections following these 
tables describe the characteristics of the states in more detail. 



Signal            Type  Description

OWN*              I/O   Ownership Request: Asserted by a master when it intends to cache
NxAD<49>                data in the exclusive state. The bit is asserted for write-backs
(address phase)         and reads from the stack. If such an operation hits in the cache
                        of another master, that master writes its data back (if its copy
                        is modified) and changes the state of its copy to invalid. If
                        OWN* is negated during a read or write, another master may not
                        assume that the copy is in the shared state when not asserting
                        the SHARE* signal.

OWNABL            I     Ownable: Asserted by the system logic during accesses by the
                        processor to locations that may be cached in the exclusive
                        state. Negated during accesses that may only be cached in the
                        shared state, such as bus-crossing accesses to an address space
                        that cannot support the MESI cache-coherency protocol. All
                        NexBus5 addresses are assumed to be cacheable in the exclusive
                        state.

                        The OWNABL signal is provided in case system logic needs to
                        restrict caching to certain locations. In systems using logic
                        that does not have an OWNABL signal, the processor's OWNABL
                        input is typically tied high for write-back configurations to
                        allow caching in the exclusive state on all reads.

SHARE*            O     Shared Data: SHARE* is asserted by any NexBus5 master during
GSHARE            I     block reads by another NexBus5 master to indicate to the other
                        master that its read hit in a block cached by the asserting
                        master, and that the data being read can only be cached in the
                        shared state, if OWN* is negated. GSHARE is the backplane NAND
                        of all SHARE* signals. If GSHARE and OWN* are both negated
                        during the read, the data may be promoted to the exclusive
                        state because no other NexBus5 device declared via SHARE* that
                        it has cached a copy. Code fetches will stay in the shared
                        state.

Figure 25 Cache State Controls

Signal            Type  Description

SNPNBL            I/O   Snoop Enable: Asserted to indicate that the current operation
NxAD<57>                affects memory that may be valid in other caches. When this
                        signal is negated, snooping devices need not look up the
                        addressed data in their cache tags. This signal is negated by
                        the processor on write-backs.

DCL*              O     Dirty Cache Line: Asserted during operations by another master
GDCL              I     to indicate that the processor has cached the location being
                        accessed in a modified (dirty) state.

                        During reads, the requesting master's cycle is aborted so that
                        the processor, as an intervenor, can preemptively gain control
                        of the NexBus and write back its modified data to main memory.
                        While the data is being written to memory, the requesting
                        master reads it off the NexBus5. The assertion of DCL* is the
                        only way in which atomic 32-byte cache-block fills by another
                        NexBus5 master can be preempted by the processor for the
                        purpose of writing back dirty data.

                        During writes, the initiating master is allowed to finish its
                        write. The NexBus5 Arbiter must then guarantee that the
                        processor asserting DCL* gains access to the bus in the very
                        next arbitration grant, so that the processor can write back
                        all of its modified data except the bytes written by the
                        initiating master. (In this case, the initiating master's data
                        is more recent than the data cached by the processor asserting
                        DCL*.)

Figure 26 Bus Snooping Controls
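Group signals such as GSHARE combine the active-low outputs of all devices. Figure 25 describes GSHARE as the backplane NAND of all SHARE* signals; the sketch below shows why a NAND of active-low pin levels yields an active-high "any device asserted" signal. The function name is hypothetical, and pin levels are modeled as booleans (True = high voltage).

```python
def group_signal(active_low_pin_levels) -> bool:
    """NAND of pin levels: high (True) when at least one input is low.

    Each input is the electrical level of one device's active-low output,
    so a low level (False) means that device is asserting the signal.
    """
    return not all(active_low_pin_levels)


# No device asserts SHARE* (all pins high): GSHARE is negated (low).
print(group_signal([True, True, True]))
# One device asserts SHARE* (its pin is low): GSHARE is asserted (high).
print(group_signal([True, False, True]))
```

The same reasoning would apply to other group signals derived from active-low outputs, assuming they are combined in the same way.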



Invalid State 

After reset, all cache locations are invalid. This state implies that the block being accessed is not 
correctly associated with its tag. Such an access produces a cache miss. A read-miss causes the 
processor to fetch the block from memory on the NexBus and place a copy in the cache. If OWN* is 
negated and GSHARE is asserted, the block changes state from invalid to shared, provided that the 
memory slave asserts the GBLKNBL signal when each qword is transferred. If the processor asserts 
OWN* when OWNABL is asserted, or if no other caching device shares the block (GSHARE 
negated), the processor may change the state of the block from invalid to exclusive. If GBLKNBL is 
negated, the data may be used by the processor but it will not be cached, and the cache block will 
remain invalid. 
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The processor will invalidate a block if another master performs any operation with OWN* asserted 
that addresses that block, and OWNABL and GXACK are simultaneously asserted. If the block's 
previous state was modified, the processor will also intervene in the other master's operation to write 
back the modified data. 
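The read-miss rules above can be condensed into one decision function. This is a sketch, not a statement of the exact hardware priority: signal arguments are True when asserted, the function name is hypothetical, and treating a negated OWNABL as forcing the shared state is an assumption drawn from the OWNABL description in Figure 25.

```python
def fill_state(own: bool, ownabl: bool, gshare: bool, gblknbl: bool) -> str:
    """State in which a block is cached after a read miss from the invalid state."""
    if not gblknbl:
        # Data may be used by the processor but is not cached.
        return "invalid"
    if not ownabl:
        # Assumption: locations that are not ownable may only be shared.
        return "shared"
    if own or not gshare:
        # Ownership was requested, or no other caching device shares the block.
        return "exclusive"
    return "shared"
```

For instance, a read with OWN* negated and GSHARE asserted yields "shared", while the same read with GSHARE negated may be promoted to "exclusive".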

Shared State 

When the processor performs a read with OWN* negated and GSHARE asserted, and the read misses 
the cache, the block will be cached in the shared state. The shared state indicates that the cache block 
may be shared with other caching devices. A block in this state mirrors the contents of main 
memory. When the processor has cached data in the shared state, it snoops NexBus memory 
operations by other masters, ignoring only operations for which SNPNBL is negated. When the 
processor performs block reads that hit in a block shared with another master, that master asserts 
SHARE*. 

When the processor performs a write with OWN* negated — or when it performs a write with OWN* 
asserted, OWNABL negated, and GXACK asserted — other masters may either invalidate their copy 
or update it and retain it in the shared state. 

When the processor performs a write to a shared block, the processor (1) writes the data through to 
main memory while asserting OWN* so as to cause other caching masters to invalidate their copies, 
(2) updates its cache to reflect the write, and (3) changes the state of the block to exclusive if 
OWNABL and GXACK are both asserted during the write; otherwise the state remains shared. 

If the processor performs a read or write in which OWN*, OWNABL, and GXACK are all asserted, 
other masters invalidate their copy of such blocks. 

Exclusive State 

When the processor performs a read with OWN* asserted or GSHARE negated, and the read misses 
the cache, the block will be cached in the exclusive (owned clean) state. In the exclusive state, as in 
the shared state, the contents of a cache block mirror those of main memory. However, the processor 
is assured that it contains the only copy of the data in the system. Thus, any subsequent write can be 
performed directly to cache and need not be immediately written back to memory. The cache block 
so modified will then be in the modified state. Just as with shared cache blocks, the processor snoops 
NexBus memory operations when it has cached data in the exclusive state, except when SNPNBL is 
negated. 

If another master asserts OWN* while hitting in an exclusive block in the processor, the processor 
invalidates its copy. A read by another master with OWN* negated that hits in an exclusive block 
forces the processor to assert SHARE* and change the block to the shared state, if CACHBL is 
asserted. If a write by another master hits in an exclusive block, the processor invalidates the block. 
OWNABL has no effect on snooping the exclusive and modified states, since a cache block could 
not have been cached in these states if the block were not ownable. 
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Modified State 

The modified (owned stale or dirty) state implies that a cache block previously fetched in the 
exclusive state has been subsequently written to and no longer matches main memory. As in the 
exclusive state, the processor is assured that no other master has cached a copy so the processor can 
perform writes to the cache without writing them to memory. 

Reads and single-qword writes by other masters that address a modified block cause the processor to 
assert DCL* and perform an intervenor operation. The processor writes back its cached data to 
memory and the other master simultaneously reads it from the NexBus. 

During external non-OWN* reads, the processor changes its copy of the block to the shared state. If 
an external non-OWN* single-qword write with CACHBL asserted hits in a modified block, the 
processor asserts DCL* and intervenes in the operation. The processor then either asserts SHARE* 
or invalidates the block during the operation. For external block writes (unlike the single-qword 
writes described above), the processor does not perform an intervenor operation with a write-back 
because the other master overwrites the entire cache block(s). If an external block write hits a 
modified processor block, the processor invalidates the block. 

Internal reads or writes do not change the state of a modified block. However, if another master 
attempts to write to a block that has been modified by the processor, the modified data (or portions 
thereof) is written back to memory. During the write-back, the processor negates SNPNBL to relieve 
other caching devices of the obligation to look the address up in their caches, since a modified block 
can never be in another cache. 
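The snoop-hit behavior described in the Exclusive State and Modified State sections can be summarized in a small function. This is a simplified sketch: the function name and return format are hypothetical, CACHBL is assumed asserted, and for a single-qword write without OWN* that hits a modified block the sketch picks invalidation, although the text also allows the processor to assert SHARE* instead.

```python
def snoop_hit(state: str, op: str, own: bool, block_write: bool = False):
    """Return (next_state, intervene) for a snoop hit on an owned block.

    state: "exclusive" or "modified"
    op: "read" or "write" by another master
    own: True if the other master asserts OWN*
    intervene: True if the processor asserts DCL* and writes back
    """
    if state not in ("exclusive", "modified"):
        raise ValueError("sketch covers only the exclusive and modified states")
    if op not in ("read", "write"):
        raise ValueError("op must be 'read' or 'write'")
    dirty = state == "modified"
    if op == "write" and block_write:
        # The other master overwrites the whole block: no write-back.
        return "invalid", False
    if own or op == "write":
        return "invalid", dirty
    # Read with OWN* negated: keep a shared copy.
    return "shared", dirty
```

For example, `snoop_hit("modified", "read", own=False)` returns `("shared", True)`: the processor intervenes with a write-back and keeps the block in the shared state.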



Interrupts 

The processor supports maskable interrupts on its INTR* input, non-maskable interrupts on its NMI* 
input, and software interrupts through the INT instruction. Hardware interrupts (INTR* and NMI*) 
are asynchronous to the NxCLK clock. They are asserted by external interrupt control logic when 
that logic receives an interrupt request from an I/O device, system timer, or other source. When an 
active non-maskable interrupt request is sensed by the interrupt controller, the request is passed to 
the processor which then performs an interrupt acknowledge sequence, as defined in the Bus 
Operations chapter. Maskable interrupt requests must be asserted until cleared by the interrupt 
service routine. 

Systems logic that uses the 82C206 integrated peripheral controller (IPC), or an equivalent, uses the 
IPC to handle interrupts. The systems logic typically generates the non-maskable interrupt (NMI*) 
input to the processor, and it passes along the processor's non-maskable interrupt acknowledge to 
the 82C206 via an INTA* output. 

For Nx586s with the optional floating-point execution unit, the Nx586 generates unmasked 
floating-point error interrupts on the NPIRQ* pin. The NPIRQ* function is included for PC-AT 
compatibility. This pin is typically inverted and then connected to IRQ13. Floating-point errors are 
cleared in the same manner as in a compatible PC-AT. However, the Nx586 detects and traps the 
I/O write that normally clears the error and performs the clearing internally. Therefore, the 
supporting AT-compatible chipset does not require a dedicated signal to the processor to clear 
floating-point errors. 
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Clock Generation 

Five signals determine the manner in which the processor's internal clock phases (PH1 and PH2) are 
derived or provided: CKMODE, XSEL, NxCLK, PHE1, and PHE2. These signals select one of four 
modes: Phase-Locked Loop (the normal operating mode), External Processor Clock, Reserved, or 
External Phase Inputs, as shown in Figure 27 and described in the sections below. Note that each 
clocking mode is determined at power-on RESET*. PHE2 determines the relationship between the 
internal non-overlapping clocks. When PHE2 is pulled low, narrow non-overlapped clocks are 
generated. When PHE2 is pulled high, wide non-overlapped clocks are produced. 



Mode Type                   Mode #   RESET*   CKMODE   XSEL   PHE1

Phase-Locked Loop             0        ↑        0        0     0
(normal operating mode)

External Processor Clock      1        ↑        0        1     Input at 2x the NxCLK
                                                               frequency

Reserved Mode                 2        ↑        1        0

External Phase Inputs         3        ↑        1        1     Externally supplied at 2x
                                                               the NxCLK frequency



Figure 27 Clocking Modes 



Mode #0: In the phase-locked loop mode, the internal clock phases are derived from the external 
NxCLK clock via a phase-locked loop (PLL). In all modes, the NxCLK input must be 
driven at one-half the processor's internal operating frequency so as to provide the 
bus-interface logic with a signal that defines the external clock cycle. 

Mode #1: In the external processor clock mode, the internal clock phases are derived from the 
PHE1 input signal. The PHE1 input signal operates at twice the frequency of NxCLK. 
The falling edge of the internal PH2 phase occurs before the rising edge of XREF (a 
buffered NxCLK output) and can be observed on the XPH2 output. This mode allows 
bypassing the internal PLL for test purposes or to change the clock frequency, as when 
entering or leaving a low-power mode. 

Mode #2: This is a reserved mode. 

Mode #3: In the external phase inputs mode, the internal clock phases are controlled by the two 
external phase inputs, PHE1 and PHE2. These inputs are buffered internally to drive the 
processor clock distribution system. 
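The mode selection can be expressed as a lookup on the CKMODE and XSEL levels sampled at power-on RESET*. The sketch below is illustrative only: the table encodes the CKMODE/XSEL combinations as read from Figure 27, the function names are hypothetical, and the frequency helper simply applies the stated rule that NxCLK runs at one-half the internal operating frequency.

```python
# (CKMODE, XSEL) -> (mode number, mode name), as read from Figure 27.
CLOCK_MODES = {
    (0, 0): (0, "Phase-Locked Loop"),
    (0, 1): (1, "External Processor Clock"),
    (1, 0): (2, "Reserved"),
    (1, 1): (3, "External Phase Inputs"),
}


def clock_mode(ckmode: int, xsel: int):
    """Return (mode number, mode name) for the sampled pin levels."""
    return CLOCK_MODES[(ckmode, xsel)]


def internal_clock_mhz(nxclk_mhz: float) -> float:
    """NxCLK is driven at one-half the internal operating frequency."""
    return 2 * nxclk_mhz


print(clock_mode(0, 0))
print(clock_mode(1, 1))
```

A system design would normally strap the pins for mode 0; the other entries are included only to mirror the figure.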
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Bus Operations 



This chapter covers NexBus processor cycles, NexBus5 system bus cycles, and cache-coherency 
operations. The processor bus cycles are conducted primarily on NexBus, although their effects can 
also be seen on the L2 SRAM bus. The NxCLK clock, shown in the timing diagrams accompanying 
this text, runs at half the frequency of the processor's internal clock. 

In this chapter, the term "clock" refers to the NexBus clock, not to the processor clock, as is meant 
elsewhere throughout this book. 

The notation regarding Source in the left-hand column of the timing diagrams shown in this section 
indicates the chip or logic that generates the signal. When signals are driven by multiple sources, all 
sources are shown, in the order in which they drive the signal. In some cases, signals take on 
different names as outputs are NANDed in group-signal logic. In these cases, the signal source is 
shown with additional notations, where the additional notations indicate the device or logic that 
originally caused the change in the signal. 



Level-2 Asynchronous SRAM Accesses 

Figure 18 in the Nx586 Hardware Architecture chapter compares the basic clock timing for the 
processor, its L1 caches, and the L2 cache. An L1 cache miss may cause an access to the L2 cache, 
which resides off-chip on a dedicated 64-bit bus. Figure 28 shows a read, write, and read to the L2 
cache. Transfers can begin on any processor clock and occur at the peak rate of eight bytes every 
two processor clocks. 
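The peak rate quoted above translates into bandwidth as follows. This is a worked calculation only; the function name is hypothetical and the example frequency is an arbitrary illustration, not a specified operating point.

```python
def l2_peak_bytes_per_second(processor_hz: float) -> float:
    """Peak L2 transfer rate: eight bytes every two processor clocks."""
    return processor_hz * 8 / 2


# Illustrative processor clock only; substitute the actual frequency.
example_hz = 100e6
print(l2_peak_bytes_per_second(example_hz) / 1e6, "million bytes per second")
```

Because the processor clock runs at twice NxCLK, the same figure can be obtained from the NxCLK frequency by doubling it first.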

Reads (or writes) can be back-to-back without dead cycles. An idle cycle is shown after the last 
read. The processor clock, which runs at twice the rate of the NexBus clock (NxCLK), is 
represented here by its two phases, PH1 and PH2. These phases are not visible at the pins except 
through the delayed outputs, XPH1 and XPH2. The data-sampling point is shown as the falling edge 
of PH2, which is relative to the rising edge of NxCLK. Two pins for COE* are shown, A and B. Both 
pins are identical in function and transition on the rising edge of PH1. The two pins are made 
available for loading considerations. 

Figure 28 Level-2 Asynchronous Cache Read and Write 

The L2 cache controller provides data to the processor in 3 CPU phases. In other words, the 
cache cycle time is 1.5 CPU clocks (one clock is equal to two phases). Data is provided to 
either the CPU core or the L1 cache in 1.5 CPU clocks. L2 cache address generation occurs 
before the cycle starts. A total of 7.5 clocks are required for a cache line fill, as shown in 
Figure 29. 







[Timing diagram: four consecutive cache reads. Signals shown: NxCLK, PH1, PH2, CADDR<17:3>, 
COEA*, COEB*, CDATA<63:0>, and CWEn*, with the sampling point shown relative to CLK. 
Source: P=Processor, L=L2 cache, S=System logic] 

Figure 29 Level-2 to Level-1 Asynchronous Cache Line Fill 






Level-2 Synchronous SRAM Accesses 

The SRAMs required for synchronous mode are of the "Synchronous Flow Through" type with wide 
I/O (32 or 36 bits). A single clocking pin, CKMODE, is used to initiate the read/write operations. At 
the rising edge of CKMODE, all addresses, write-enables, chip selects, and data are registered within 
the SRAM. It is assumed that new signals can be applied to the SRAMs prior to data out valid. Read 
data is sampled on the next rising edge of CKMODE (approximately two PH2 clocks later). A dead 
cycle for bus turnaround time is provided during read followed by write cycles (approximately one 
PH2 clock). Figure 30 shows the signal relationships for the synchronous SRAM mode. 




Figure 30 Level-2 Synchronous Cache Read and Write cycles 
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NexBus and NexBus 5 Arbitration and Address Phase 

Processor operations on NexBus/NexBus5 may or may not begin with arbitration for the bus. To 
obtain the bus, the processor asserts NREQ*, LOCK*, and/or AREQ* to the arbiter, which responds 
to the arbitration winner with GNT*. Automatic re-grant occurs when the arbiter holds GNT* 
asserted at the time the processor samples it, in which case the processor need not assert NREQ*, 
LOCK*, or AREQ* and can immediately begin its operation. 

NREQ*, when asserted, remains active until GNT* is received from the arbiter. In systems using the 
systems logic that interfaces directly to NexBus, NREQ* is typically treated the same as AREQ*; 
when NexBus control is granted, control of all other buses is also granted at the same time. 

LOCK* is asserted during sequences in which multiple bus operations should be performed 
sequentially and uninterrupted. This signal is used by the arbiter to determine the end of such a 
sequence. Cache-block fills are not locked; they are implicitly treated as atomic reads. Arbiters 
may allow a master on another system bus to intervene in a locked NexBus transaction. To avoid 
this, the processor asserts AREQ*. LOCK* is typically software-configured to be asserted for 
read-modify-writes and explicitly locked instructions. 

AREQ* is asserted to gain control of the NexBus5 or any other buses supported by the system. This 
signal always remains active until GNT* is received. 

When GNT* is received, the processor places the address of a qword (for memory operations) on 
NxAD<31:3> or the address of a dword (for I/O operations) on NxAD<15:2>. It drives status bits on 
NxAD<63:32> and asserts its ALE* signal to assume bus mastership and to indicate that there is a 
valid address on the bus. The processor asserts ALE* for only one bus clock. The slave uses the 
GALE signal generated by system logic to enable the latching of address and status from the 
NexBus5. 
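The address-phase layout can be illustrated by composing a 64-bit value for NxAD<63:0>. This is a simplified sketch with several assumptions: only the address field and the two status bits named in Figures 25 and 26 (OWN* on NxAD<49>, SNPNBL on NxAD<57>) are modeled, all other status bits are left at zero, and an asserted active-low bit is assumed to be driven as logic 0. The function name is hypothetical.

```python
QWORD_ADDR_MASK = 0xFFFF_FFF8   # memory: qword address on NxAD<31:3>
DWORD_IO_MASK = 0x0000_FFFC     # I/O: dword address on NxAD<15:2>
OWN_BIT = 49                    # OWN* (active-low, per Figure 25)
SNPNBL_BIT = 57                 # SNPNBL (active-high, per Figure 26)


def address_phase_word(byte_addr: int, own_asserted: bool,
                       snpnbl_asserted: bool, io: bool = False) -> int:
    """Compose a simplified NxAD<63:0> address-phase value."""
    word = byte_addr & (DWORD_IO_MASK if io else QWORD_ADDR_MASK)
    if not own_asserted:
        word |= 1 << OWN_BIT        # assumption: negated OWN* reads as 1
    if snpnbl_asserted:
        word |= 1 << SNPNBL_BIT
    return word


w = address_phase_word(0x12345, own_asserted=True, snpnbl_asserted=True)
print(hex(w & 0xFFFF_FFFF))  # low-order bits are dropped: qword-aligned
```

The masks show why the low three address bits (memory) or low two bits (I/O) never appear on the bus during the address phase.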



NexBus Basic Operations 

The Nx586 supports two local bus interfaces, NexBus and NexBus5. NexBus is a true CPU 
(processor) local bus, whereas NexBus5 is a NexGen proprietary system bus. While RESET* is 
active, the XCVERE* pin is sampled to select the local bus mode; XCVERE* determines which 
type of bus the processor generates. When XCVERE* is pulled high, the Nx586 generates the 
NexBus standard, which requires external transceivers to connect the processor to the NexBus5 
system bus. Figures 31 and 32 show the NexBus transceiver control signals for basic qword read 
and write operations. AD<63:0> is the multiplexed NexBus processor local bus, while NxAD<63:0> 
is the multiplexed NexBus5 system bus. 
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Figure 31 Fastest NexBus Single-Qword Read 
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Figure 32 Fastest NexBus Single-Qword Write 
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NexBus 5 Single-Qword Memory Operations 

Figure 33 shows the fastest possible NexBus5 single-qword read. The notation regarding Source 
indicates the logic that originated the signal as an output. In this figure and others to follow, the 
source of group signals (such as GXACK) is shown with additional symbols indicating the device or 
logic that output the originally activating signal. For example, the source of the GXACK signal is 
shown as "S,P", which means that system logic (S) generated GXACK but that the processor (P) 
caused this by generating XACK*. In some timing diagrams later in this section, bus signals take on 
different names as outputs cross buses through transceivers or are ORed in group-signal logic; in 
these cases, the source of the signals is shown with additional symbols indicating the logic that 
originally output the activating signals. 

The data phase of a fast single-qword read starts when the slave responds to the processor's request 
by asserting its XACK* signal. The processor samples the GXACK and GXHLD signals from system 
logic to determine when data is placed on the bus. The processor then samples the data at the end of 
the bus clock after GXACK is asserted and GXHLD is negated. The operation finishes with an idle 
phase of at least one clock. 

This protocol guarantees the processor and other caching devices enough time to recognize a 
modified cache block and to assert GDCL in time to cancel a data transfer. A slave may not assert 
XACK* until the second clock following GALE. However, the slave must always assert XACK* 
during or before the third clock following GALE, since otherwise the absence of an active GXACK 
indicates to the systems logic interface between the NexBus5 and other system buses (called the 
alternate-bus interface) that the address must reside on the other system bus. In that case, the 
systems logic interface to that other bus assumes the role of slave and asserts GXACK. 
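The XACK* timing window described above can be captured in a few lines. This sketch is illustrative: the function name is hypothetical, and clocks are counted as "the Nth clock following GALE", with `None` meaning that no slave asserted XACK*.

```python
def responder_for(xack_clock):
    """Decide who acts as slave, given when XACK* is first asserted.

    xack_clock: the clock following GALE in which the slave asserts XACK*,
    or None if no slave responds.
    """
    if xack_clock is not None and xack_clock < 2:
        raise ValueError("XACK* may not be asserted before the second clock")
    if xack_clock in (2, 3):
        return "slave"
    # No GXACK by the third clock: the address is taken to reside on the
    # other system bus, and the alternate-bus interface asserts GXACK.
    return "alternate-bus interface"
```

The two-clock minimum is what gives caching devices time to assert GDCL before a data transfer begins.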

Figure 33 shows when GBLKNBL may be asserted. If appropriate, the slave must assert GBLKNBL 
no later than it asserts XACK*, and it must keep GBLKNBL asserted until it negates XACK*. It 
must negate GBLKNBL at or before it stops placing data on the bus. Although not shown, 
OWNABL must also be valid (either asserted or negated) whenever GXACK is asserted. In the 
example shown in Figure 34, the slave asserts GXACK at the latest allowable time, thereby 
effectively inserting one wait state. The slave may or may not drive the NxAD<63:0> signals during 
the wait states. The processor will not drive them during the data phase of a read operation. 
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Figure 33 Fastest NexBus 5 Single-Qword Read 




If the slave is unable to supply data during the next clock after asserting XACK*, the slave must 
assert its XHLD* signal at the same time. Similarly, if the processor is not ready to accept data in 
the next clock it asserts its XHLD* signal. The slave supplies data in the clock following the first 
clock during which GXACK is asserted and GXHLD is negated. The processor strobes the data at 
the end of that clock. A single-qword read with wait states is shown in Figures 35 and 36. For such
an operation, the slave must negate XACK* after a single clock during which GXACK is asserted
and GXHLD is negated, and it must stop driving data onto the bus one clock thereafter. The
processor does not assert XHLD* while GALE is asserted, nor may either party to the transaction 
assert XHLD* after the slave negates GXACK. In the example shown in Figure 35, the slave asserts 
GXACK at the latest allowable time, thereby inserting one wait state, and GXHLD is asserted for 
one clock to insert an additional wait state. The slave may or may not drive the NxAD<63:0> signals 
during the wait states. The processor will not drive them during the data phase of a read operation.

Figure 35 NexBus5 Single-Qword Read with Wait States using a delayed GXACK

A single-qword write operation is handled similarly. Figure 37 illustrates the fastest write operation
possible. Figure 38 shows a single-qword write with wait states. After the bus is granted, the 
processor puts the address and status on the bus and asserts ALE*. As in the read operation, the slave 
must assert its XACK* signal during either the second or third clock following the assertion of 
GALE. If the slave is not ready to strobe the data at the end of the clock following the assertion of 
GXACK, it must assert its XHLD* signal. The processor places the data on the bus in the clock after 
the assertion of GXACK, which may be as soon as the third clock following the assertion of GALE. 
The slave samples GXHLD to determine when the data is valid. The processor will drive data as 
soon as it is able, and it continues to drive the data for one (and only one) clock after the 
simultaneous assertion of GXACK and negation of GXHLD. As in the read operation, the slave's 
XACK* is asserted until the clock following the trailing edge of GXHLD.

Figure 38 NexBus5 Single-Qword Write With Wait States

NexBus5 Cache Line Memory Operations

The processor performs cache line fill or block operations with memory at a much higher bandwidth 
than the single-qword operations described in the previous section. Block operations, both reads and 
writes, are done only in four-qword increments (32 bytes). All cache line reads are cache fills.

Cache line reads and writes are indicated by the assertion of BLKSIZ* during the address/status 
phase of the bus operations, as previously defined for single-qword operations. 

A cache line operation consists of a single address phase followed by a multi-transfer data phase. The 
data transfer may begin with any qword in the block, as indicated by the address bits, but it then 
proceeds through additional qwords of the specified contiguous data in any order. 
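
A minimal sketch of how a starting address maps onto a four-qword block follows. It assumes a plain byte address in which bits <4:3> select the qword within the 32-byte block. The function name is hypothetical, and the order of the remaining transfers is deliberately not modeled because the text above does not fix it.

```python
BLOCK_BYTES = 32   # four qwords per cache line
QWORD_BYTES = 8


def block_start(byte_address):
    """Return (block base address, index of the first qword transferred)."""
    base = byte_address & ~(BLOCK_BYTES - 1)
    first_qword = (byte_address // QWORD_BYTES) % (BLOCK_BYTES // QWORD_BYTES)
    return base, first_qword


base, first = block_start(0x12345678)
```

For the address used here, the block base is 0x12345660 and the transfer begins with qword 3 of the block.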



NexBus5 I/O Operations

I/O operations on the NexBus5 are performed exactly like single-qword reads and writes, with three
exceptions. First, the I/O address space is limited to 64K bytes. Second, the 16-bit I/O address is
broken into two fields: fourteen address bits and two byte-enable bits. I/O addresses do not use
BE<7:2>* (which must be set to all 1's) but instead specify a quad address on NxAD<2>. Third, data
is always transferred on NxAD<15:0>, and NxAD<63:16> is undefined during the data transfer 
phase of an I/O operation. 

I/O operations are indicated by driving 010 (data read) or 011 (data write) on NxAD<48:46> and
all zeros on NxAD<31:16> when GALE is asserted. I/O space is always non-cacheable, so a slave 
should never assert GBLKNBL when responding to an I/O operation. 
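
The I/O indication described above can be checked with a small decoder. This is a sketch only: it assumes the 64-bit value sampled from NxAD<63:0> while GALE is asserted is available as a Python integer, and the function name `decode_io` is hypothetical.

```python
IO_READ = 0b010    # NxAD<48:46> for an I/O data read
IO_WRITE = 0b011   # NxAD<48:46> for an I/O data write


def decode_io(nxad):
    """Return "read" or "write" for an I/O address/status word, else None."""
    op = (nxad >> 46) & 0b111        # operation type, NxAD<48:46>
    upper = (nxad >> 16) & 0xFFFF    # NxAD<31:16>, all zeros for I/O
    if upper != 0:
        return None
    if op == IO_READ:
        return "read"
    if op == IO_WRITE:
        return "write"
    return None
```

A monitor built this way would also want to flag any slave that asserts GBLKNBL in response, since I/O space is never cacheable.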



NexBus5 Interrupt-Acknowledge Sequence

When external interrupt-control logic senses an interrupt request, it signals the request to the
processor. The processor acknowledges the request (during which sequence the controller passes
the interrupt vector) and then services the interrupt as specified by the vector. The hardware
mechanism is described above in the Hardware Architecture chapter.

An interrupt-acknowledge sequence, shown in Figure 39, consists of two back-to-back locked reads
on NexBus5, where the operation type (NxAD<48:46>) is 000 and the byte enable bits BE<7:0>* =
11111110. The first (synchronizing) read is used to latch the state of the interrupt controller. It is
indicated by NxAD<2> = 1 (I/O-byte address 4). The second read is used to transfer the 8-bit 
interrupt vector on NxAD<7:0> to the processor, which uses it as an index to the interrupt service 
routine. This read is indicated by NxAD<2> = 0 (I/O-byte address 0). During these two reads only 
the least significant bit of the address field is driven to a valid state. The most significant bits are 
undefined. After the interrupt is serviced, the request is cleared and normal processing resumes.

NexBus5 Halt and Shutdown Operations

Halt and shutdown operations are signaled on the NexBus5 by driving 001 on NxAD<48:46> during
the address/status phase, as shown in Figure 40. The halt and shutdown conditions are distinguished 
from one another by the address that is simultaneously signaled on the byte-enable bits, BE<7:0>* 
on NxAD<39:32>. The processor does not generate a data phase for these operations. 



Type of     NxAD<48>  NxAD<47>  NxAD<46>  NxAD<39:32>  NxAD<31:3>  NxAD<2>
Bus Cycle   M/IO*     D/C*      W/R*      BE<7:0>*

Halt        0         0         1         11111011     undefined   0
Shutdown    0         0         1         11111110     undefined   0

Figure 40 Halt and Shutdown Encoding

For the halt operation, the processor places an address of 2 on the bus, signified by BE<7:0>* bits 
(NxAD<39:32>) = 11111011. NxAD<2> = 0 and NxAD<31:3> are undefined. After this, the 
processor remains in the halted state until NMI*, RESETCPU*, or RESET* becomes active. 

For the shutdown operation, the processor places an address of 0 on the bus, signified by BE<7:0>* 
bits (NxAD<39:32>) = 11111110. NxAD<2> = 0 and NxAD<31:3> are undefined. An external 
system controller should decode the shutdown cycle and assert RESETCPU*. The processor then
performs a soft reset: the processor itself is reset, but the memory contents, including modified
cache blocks, are retained.
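
For a system controller model, the two encodings can be told apart as sketched below. The sketch assumes the address/status word is available as a 64-bit Python integer. The function names are hypothetical, and any other byte-enable pattern under operation type 001 is reported as "unknown" rather than interpreted.

```python
SPECIAL_OP = 0b001         # NxAD<48:46> for halt and shutdown
HALT_BE = 0b11111011       # BE<7:0>* for halt (address 2)
SHUTDOWN_BE = 0b11111110   # BE<7:0>* for shutdown (address 0)


def byte_address(be):
    """Index of the lowest asserted (low) byte-enable bit, or None."""
    for bit in range(8):
        if not (be >> bit) & 1:
            return bit
    return None


def decode_special(nxad):
    """Return "halt", "shutdown", "unknown", or None if not type 001."""
    if (nxad >> 46) & 0b111 != SPECIAL_OP:
        return None
    be = (nxad >> 32) & 0xFF   # BE<7:0>* on NxAD<39:32>
    if be == HALT_BE:
        return "halt"
    if be == SHUTDOWN_BE:
        return "shutdown"
    return "unknown"
```

The helper `byte_address` shows why the text speaks of addresses 2 and 0: the single asserted byte-enable bit gives the byte address directly.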

Because the Nx586 processor has a 64-bit data bus rather than a 32-bit data bus, a total of eight
byte-enable bits (BE<7:0>*) are specified for the quadword-wide bus.

Obtaining Exclusive Use Of Cache Blocks

The processor can obtain ownership of a cache block either preemptively or passively. Preemptive 
ownership is gained by asserting OWN* during the address/status phase of a read or write operation. 
Whenever the processor needs to write a cache block that is cached in either the shared or the
invalid state, it performs a preemptive read-to-own operation by asserting OWN* during a
single-qword write or four-qword block read.

Passive ownership is normally gained when the processor performs a block read, because other 
NexBus5 caching devices must snoop block reads. If any part of a block addressed by the processor's
read operation resides in another NexBus5 device's cache, regardless of state, that device asserts
SHARE* after the assertion of GALE but not later than the clock during which the first qword of the 
block is transferred. SHARE* remains asserted through the entire data transfer. If the processor sees 
GSHARE negated during a block read when it samples the first qword of the block, it knows that it 
has the only copy. It can therefore cache the block in the exclusive state rather than the shared state, 
if and only if OWNABL is asserted by system logic. 

If another NexBus5 caching device is unable to meet this timing in the fastest possible case, it must
assert XHLD* to delay the operation until it is able to perform the cache check. It is possible to put
a caching device on NexBus5 that is unable to check its cache and report SHARE* correctly and
instead always asserts SHARE*, but this prevents the processor from gaining passive ownership on
any block read and so reduces system efficiency. It is also possible to design a device that
invalidates its cache block during any block read hit, in which case only the efficiency of that one
device is impaired.

If the processor addresses a non-cacheable block on a system bus other than NexBus5, the systems
logic interface between the NexBus5 and the other system bus (called the alternate-bus interface)
must indicate this by negating GBLKNBL, and it may not perform block reads or writes to such a 
block. If the block on the other bus is cacheable, it can only be cached in the shared state, since 
standard system buses (such as VL bus and ISA bus) do not support the MESI caching protocol, and 
it is not possible to cache their memory addresses in the exclusive state. 

The OWNABL signal from system logic is used to indicate cacheability of locations on other system 
buses. Whenever OWNABL is negated during a bus operation, the processor will not cache the block 
in the exclusive state even if the processor asserted OWN*; instead, it may cache the block in the 
shared state if other conditions permit it. 
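
The ownership rules in this section can be condensed into a small decision function. This is a sketch of one reading of the text, not a definitive statement of processor behavior: it treats OWN* as granting ownership whenever OWNABL is asserted, and it ignores GBLKNBL and any other conditions that might prevent caching. The function name and arguments are hypothetical, with True meaning asserted.

```python
def fill_state(gshare, ownabl, own=False):
    """MESI state in which a block read is cached, under the stated assumptions.

    gshare: GSHARE sampled with the first qword of the block.
    ownabl: OWNABL from system logic.
    own:    True if the processor asserted OWN* (preemptive ownership).
    """
    if not ownabl:
        # Locations on other system buses cannot be held exclusively,
        # even if OWN* was asserted.
        return "shared"
    if own:
        return "exclusive"
    # Passive ownership: exclusive only if no other cache reported a copy.
    return "shared" if gshare else "exclusive"
```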

GBLKNBL and GSHARE must be asserted by system logic at the same time that OWNABL is 
negated. The timing of these three signals is identical: they should be valid whenever GXACK is 
asserted. They may be (but need not be) asserted ahead of XACK*, and may (but, except for 
GSHARE, need not) be held one clock after the negation of XACK*. This timing differs from that of 
GSHARE, since when OWNABL is asserted GSHARE is not required to be valid until the clock 
following the negation of GXHLD (i.e., coincident with the data transfer).

NexBus5 Intervenor Operations

The examples given above assume that the addressed data does not reside in a modified cache block. 
When an operation by another NexBus5 master results in a cache hit to a modified block in the
processor, the processor intervenes in the operation by asserting DCL*. The timing for DCL* is the
same as that for SHARE*: the NexBus master samples GDCL on the same clock in which it samples
NexBus5 data. An asserted GDCL indicates to the master that data cached by the processor is
modified. To meet the fastest timing requirements, the processor asserts DCL* no later than the third 
clock following the assertion of GALE. If a MESI write-back caching device is unable to determine 
in a timely manner whether a transaction hits in its cache, it must assert XHLD* to delay the transfer. 

If a block write operation by another master hits a modified cache block in the processor, the 
processor does not assert DCL*, since such a block write replaces all of a cache block. Instead, the 
processor invalidates the block. 

An addressed slave that sees GDCL asserted during the first qword transfer of an operation must 
abort the operation by negating GXACK. It may then perform a block write-back starting with the 
first qword. Immediately after the operation is completed, as determined by the negation of GXACK, 
the NexBus5 Arbiter must grant the bus to the intervenor by asserting GNT*. The arbiter must not
grant the bus to any other requester, even if the previous master has asserted AREQ* and/or LOCK*,
because DCL* has the highest priority. Upon seeing GNT* asserted, the intervenor
(whether the processor or another master) immediately updates the memory by performing a block 
write, beginning at the qword address specified in the original operation. The intervenor negates 
DCL* before performing the first data transfer, but not before it asserts ALE*. During this memory 
update, the master must sample the data it requested (if the operation was a read) as it is sent to 
memory on NexBus5 by the intervenor. If the master is not ready to sample the data, it can assert
XHLD*, as can both the intervenor and the slave; all three parties to the operation examine GXHLD 
to synchronize the data transfer. 
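
The grant priority described above might be modeled as follows. Only the first rule (an intervenor that asserted DCL* wins over everything else) comes from this section. The handling of a master that holds the bus with AREQ* and/or LOCK*, and the first-come choice among the remaining requesters, are placeholder assumptions, as are the function and argument names.

```python
def next_grant(intervenor, lock_holder, requesters):
    """Pick the device that receives GNT* once GXACK has been negated.

    intervenor:  device that asserted DCL* during the operation, or None.
    lock_holder: previous master still asserting AREQ* and/or LOCK*, or None.
    requesters:  other devices requesting the bus, in arrival order.
    """
    if intervenor is not None:
        return intervenor            # DCL* has the highest priority
    if lock_holder is not None:
        return lock_holder           # assumed: a locked sequence continues
    return requesters[0] if requesters else None   # assumed ordering
```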



Modified Cache-Block Hit During Single-Qword Operations 

During single-qword reads that hit in a modified cache block, the NexBus5 sequence looks like a
normal single-qword read from the memory followed by a block write by the intervenor. Figure 41 
illustrates the timing. The fastest time is shown for the operation, while both the fastest and slowest 
possible times are shown for the leading edge of GDCL. For a slow device intervening in a fast 
operation, GDCL is available to be sampled on the same clock as the first qword of data is available. 

In Figure 41, two sources are shown for GALE and NxAD<63:0>, and one source (Sp) has a 
subscript. The source is the chip or logic that outputs the signal. The subscript for the source 
indicates the chip or logic that originally caused the change in the signal. 

During single-qword writes, the master with the modified cache block asserts DCL* to indicate that 
the single write will be followed by a block write. If the single write included only some of the bytes
of the qword, the intervenor records this fact, and during the subsequent block write it outputs
byte-enable bits indicating the other bytes of the qword. For example, if the byte-enable bits of the
single write were 00000111, the intervenor outputs 11111000. In other words, the intervenor updates
only those bytes that were not written by the master. Except for such intervening write-back
operations, block writes must have all byte-enable bits asserted (00000000). During block
write-backs, byte-enable bits apply only to the first qword, so all bytes of the final three qwords are
written.
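
The byte-enable rule above amounts to a bitwise complement of the master's BE<7:0>* pattern. A minimal sketch, with a hypothetical function name, reproduces the example from the text:

```python
def intervenor_byte_enables(master_be):
    """Complement the master's BE<7:0>* pattern for the first write-back qword.

    Byte enables are active low, so a 0 bit marks a byte that is written.
    The intervenor writes exactly the bytes the master did not write.
    """
    return ~master_be & 0xFF


example = format(intervenor_byte_enables(0b00000111), "08b")
```

Here `example` is the string "11111000", matching the case given above. A master that wrote all eight bytes (00000000) leaves the intervenor with 11111111, so no bytes of the first qword are rewritten.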




Modified Cache-Block Hit During Four-Qword (Block) Operations 

As described above for single-qword operations, a block read by another NexBus5 master may hit a
modified cache block in the processor. When this happens, the processor responds exactly as for a 
single-qword operation: it asserts DCL*, waits for the assertion of GNT* following the negation of 
GXACK, and proceeds with a block write-back. It writes the entire four-qword block back to 
memory. The original bus master must sample the data in this second block operation while it is 
transferred to memory. The master may insert wait states by asserting XHLD*. Since the processor, 
as intervenor, begins its write-back with the address requested by the master, if the original block 
read is a four-qword operation, the master can intercept the data as it is transferred to memory and 
find it in the expected order. 

Block writes can hit in a modified or exclusive cache block only if the operation was initiated by the 
DMA action of a disk controller, not by the processor. Since only complete block writes are 
permitted, no write-back is required and the processor invalidates its cache block.

For Electrical Data See Document "Nx586 Electrical Specifications"
Order # NxDOC-ES001-01-W

Access — A bus master is said to "have access to a bus" when it can initiate a bus cycle on that bus.
Compare bus ownership. 

Adapter — A central processor, memory subsystem, I/O device, or other device that is attached to a 
slot on the NexBus, VL-Bus, or ISA bus. Also called a slot. 

Aligned — Data or instructions that have been rotated until the relevant bytes begin in the least- 
significant byte position. 

Allocating Write — A read-to-own (read for exclusive ownership of cacheable data) followed by a 
write to the cache. 

Arbiter — A resource-conflict resolver, such as the NexBus arbiter. 

Asserted — For signals, "asserted" means driven to the state which asserts the description of the 
signal. 

Active High — A signal or memory bit whose "asserted" state is logically or physically high. For a
memory bit, this would be a "1". For a signal, this would be near the VCC voltage level.

Active Low — A signal or memory bit whose "asserted" state is logically or physically low. For a
memory bit, this would be a "0". For a signal, this would be near the GND voltage level.

b — Bit.

B — Byte. 

Bandwidth — The number of bits per second that can be processed by a memory, arithmetic unit, 
input/output processor, or communication system. 

Bank — In a cache, same as set and way. In main memory, a qword-wide group of addressable 
locations. 

Branch Prediction — The use of history, statistical methods, or heuristic rules to predict the outcome 
of conditional branches. 

Buffer — A fraction of real memory or a group of registers that serve as a buffer for data flowing to 
and from auxiliary memory. 

Bus Cycle — A complete transaction between a bus master and a slave. For the Nx586 processor, a 
bus cycle is typically composed of an address and status phase, a data phase, and any necessary idle 
phases. Also called a bus operation, or simply operation.

Bus Operation — Same as bus cycle.

Bus Ownership — A bus is said to be owned by a master when the master can initiate cycles on the 
bus. The master to which bus ownership is granted controls only its own interface with the arbiter. 
The arbiter, on behalf of that master, acts as a master on the other buses in the system. It does this so 
as to support the master in the event that a bus-crossing operation is requested. Compare access. 

Bus Phase — Part of a bus cycle that lasts one or more bus clocks. For example, it may be a transfer of
address and status, a transfer of data, or idle clocks. 

Bus Sequence — A sequence of bus cycles (or operations) that must occur sequentially due to their 
being explicitly locked by the continuous assertion of the master's AREQ* and/or LOCK* signals, or 
implicitly locked by the GDCL signal. 

Cache Block — A 32-byte unit of data in a cache. The Nx586 processor's caches are organized around 
such blocks. Each cache block has an associated tag and MESI-protocol state. Cache blocks can be 
fetched atomically as a contiguous group of 32 bytes or in eight-byte subblock units. Compare cache
line. 

Cache-Block Tag — The high-order address bits of a cache block that identify the area of memory
from which it was copied. During a cache lookup, the high-order address bits of the processor's
operand are compared with the tags of all blocks stored in the cache.

Cache Coherence — The protocol among multiprocessors with private caches that assures that each 
variable in the shared memory space receives writes in a serial order, and no processor sees that 
sequence of values in any other order. 

Cache Hit — An access to a cache block whose state is modified, exclusive, or shared (i.e., not 
invalid). Compare cache miss. 

Cache Line — If a cache block can be fetched atomically (rather than in subblock units), the concepts 
of cache block and cache line are identical. However, in the Nx586 processor, cache blocks are often 
fetched in eight-byte subblock units, leaving only parts of the cache block valid. Compare cache 
block. 

Cache Lookup — Comparison between a processor address and the cache tags and state bits in all four 
sets (ways) of a cache. 

Cache Miss — An access to a cache block whose state is invalid. Compare cache hit. 

Caching Master — A bus master that internally caches data originated elsewhere. The caching master 
must continually monitor the bus to guarantee cache coherency. Masters on buses other than the 
NexBus can maintain caches, but they must be write-through (not write-back) caches. 

Conditional Branch — A computer instruction that alters the sequence of execution if a condition is 
true, and otherwise falls through to the next instruction in sequence. 

Clean — Same as exclusive.

Clock Cycle — Unless otherwise stated, this is a processor-clock cycle rather than a bus-clock cycle.
The Nx586 processor's clock runs at twice the frequency of the NexBus clock (NxCLK). The level-1
cache runs at the same frequency as the processor clock. The level-2 cache runs at the same 
frequency as the NexBus clock (NxCLK). 

Clock Phase — One-half of a processor clock cycle.

Cycle — See bus cycle, clock cycle, bus phase, and clock phase.

D Cache — The level-1 (L1) data cache.

Device — Same as adapter. 

Dirty — Same as modified. 

Dword — A doubleword. A four-byte (32-bit) unit of data that is addressed on a four-byte boundary.

Exclusive — One of the four states that a 32-byte cache block can have in the MESI cache-coherency 
protocol. Exclusive data is owned by a single caching device and is the only known-correct copy of 
data in the system. Also called clean data. When exclusive data is written over, it is called modified 
(or dirty) data. 

Floating Point Execution Unit — The logic in the Floating Point Execution Unit is integrated into
the parallel pipeline of the Nx586 processor.

Flush — (1) To write back a cache block to memory and invalidate the cache location, also called 
write-back and invalidate, or (2) to invalidate a storage location such as a register without writing the 
contents to any other location. This is an ambiguous term that is best not used. 

Functional Unit — The Decode Unit, Address Unit, Integer Unit, Floating Point Coprocessor, or 
Cache and Memory Unit. 

Group Signal — A NexBus control signal that represents the logical OR of several inputs. These 
signals typically have signal names that begin with the letter "G". 

I Cache — The level-1 (L1) instruction cache.

Invalid — One of the four states that a 32-byte cache block can have in the MESI cache-coherency 
protocol. Invalid data is not correctly associated with the tag for its cache block. 

Invalidate — To change the state of a cache block to invalid. 

L1 or Level-1 — The level-1 or primary cache is located on the Nx586 processor chip.

L2 or Level-2 — The level-2 or secondary cache is located in SRAM connected to the processor's
SRAM bus and controlled by logic on the Nx586 processor. 

Line — See cache block. 

Main Memory — See memory. 

Master — A device on the NexBus that initiates a transaction.

Memory — A RAM or ROM subsystem located on any bus, including the main memory most directly 
accessible to a processor. Also called main memory. 

MESI — The cache-coherency protocol used in the Nx586 processor. In the protocol, cached blocks in 
the L2 write-back cache can have four states (modified, exclusive, shared, invalid), hence the 
acronym MESI. See modified, exclusive, shared, and invalid. 

Modified Write-Once Protocol — The cache-coherency protocol used in the Nx586 processor. See 
MESI.

Modified — One of the four states that a 32-byte cache block can have in the MESI cache-coherency
protocol. Modified data is exclusive data that has been written to after being read from lower-level 
memory, and is therefore the only valid copy of that data. Also called dirty or stale. 

MWO — See modified write-once protocol. 

NB — Same as NexBus. 

Negated — For signals, "negated" means driven to the state which de-asserts the description of the
signal; the opposite of "asserted".

NexBus — A 64-bit synchronous, multiplexed bus defined by NexGen. 

No-Op — A single-qword operation with BE<7:0>* all negated. No-ops address no bytes and do 
nothing except consume processor cycles. 

Nx586 — The Nx586 processor (CPU). 

NxVL — A NexBus system controller chip that supports an Nx586 processor, main memory, 82C206
peripheral controller, VL-Bus, and ISA bus. 

Octet — A unit of data consisting of eight bytes and addressed on an eight-byte boundary. 

Operation — See bus operation. 

Owned — A cache block whose state is exclusive (owned clean) or modified (owned dirty). See also 
bus ownership. 

Ownership — See bus ownership. 

Peripheral Controller — A chip that supports interrupts, DMA, timer/counters, and a real-time clock.

Phase — See bus phase and clock phase.

PLL — Phase-locked loop. 

POST — Power On Self Test. This procedure is performed when power is first applied to check the 
functionality of the system.

Present — Same as valid. 

Processor — Unless otherwise specified, refers to an Nx586 processor.

Processor Clock — The Nx586 processor clock. See clock cycle. 

Qword — A quadword. An eight-byte unit of data that is addressed on an eight-byte boundary.

Register Renaming — A technique used in processor design that assigns idle registers to serve in the 
place of program-specified registers in order to avoid conflicts that could stall pipeline flow
momentarily. 

RISC — Reduced Instruction-Set Computer. A computer in which all instructions are simple 
instructions that take one cycle to execute, except possibly for delays introduced by conditional 
branches and cache misses. 

Scalar Operation — Any operation performed on individual data. 

Scalar Processor — A processor whose basic operations manipulate individual data elements rather 
than vectors or matrices.

Set — In a cache, one of the degrees of associativity. The group of cache blocks in such a set. Same as
bank and way. 

Shared — One of the four states that a 32-byte cache block can have in the MESI cache-coherency 
protocol. Shared data is valid data that can only be read, not written. 

Snoop — To compare an address on a bus with a tag in a cache, so as to detect operations that are 
inconsistent with cache coherency. 

Snoop Hit — A snoop in which the compared data is found to be in a modified state. Compare snoop 
miss. 

Snoop Miss — A snoop in which the compared data is not found, or is found to be in a shared state. 
Compare snoop hit. 

Source — In timing diagrams, the left-hand column of the diagram indicates the "source" of each 
signal. This is the chip that originated the signal as an output. When signals are driven by multiple 
sources, all sources are shown, in the order in which they drive the signal. The source of a signal that 
takes on a different name as it crosses buses through transceivers is shown as the transceivers 
over which the signals cross, subscripted with a symbol indicating the logic that originally output the
signals. The source of group-ORed signals (such as GXACK) is likewise subscripted with a symbol 
indicating the logic that originally output the activating signal (such as XACK*). 

Stale — Same as modified. 

System Bus — A bus to which the NexBus interfaces. The system buses include the VL-Bus, PCI-Bus 
and ISA bus. 

System Controller — The device or logic that provides NexBus arbitration and interfacing to main 
memory and any other buses in the system. 

Superscalar — A computer architecture in which multiple scalar instructions are decoded in each
clock cycle so that the number of instructions completed per cycle exceeds 1.0.

T-Byte — An 80-bit floating-point number. 

Word — A two-byte (16-bit) unit of data. 

Write-Back Cache — A cache in which WRITEs to memory are stored in cache and written to 
memory only when a rewritten item is removed from cache. 

Write-Through Cache — A cache in which WRITEs to memory are recorded concurrently both in
cache and in main memory. The result is that the main memory always contains valid data.

82C206, 59
Access, 83

Active-High Signals, vi 
Active-Low Signals, v 
AD, 24 
Adapter, 83 
ADDRESS, 25 
Address and status phase, 24 
address and status phase, 24 
Address Latch Enable, 19 
Address phase, 24, 25, 64, 71 
Address Unit, 47 
Addressing, vi 
ALE*, 4, 19, 74 
Aligned, 83 
Allocating Write, 83 
Alternate bus, 18 
Alternate-Bus Request, 17 
ANALYZEIN, 33 
ANALYZEOUT, 33 
Arbiter, 17, 83 
Arbitration, 64 
Architecture, 41 
AREQ*, 17, 64, 74 
asterisk, v 
B, vi, 83 
b, vi, 83 
Bank, 83 

Basic System Diagram, 42 
BE, 25, 26, 71, 72
Binary compatibility, 1 
BLKSIZ, 28, 71 
Block Size, 28 



Buffered Address and Data Bus, 4 
Bus, 45 

Bus Arbitration, 4 

Bus Cycle, 83 

Bus Cycle Types, 27 

Bus Lock, 18

Bus Operations, 61, 84 

Fast NexBus5 Single-Qword Read with a 
delayed GXACK, 67 
Fastest NexBus Single-Qword Read, 65 
Fastest NexBus Single-Qword Write, 65 
Fastest NexBus5 Single-Qword Read, 67 
Fastest NexBus5 Single-Qword Write, 70 
Interrupt Acknowledge Cycle, 72 
NexBus5 I/O, 71 
NexBus5 Intervenor, 74 
NexBus5 Single-Qword Read Hits Modified
Cache Block, 75
NexBus5 Single-Qword Read with Wait
States using a delayed GXACK, 68
NexBus5 Single-Qword Read with Wait
States using GXHLD only, 69
NexBus5 Single-Qword Write with Wait
States, 70
Bus Ownership, 84 
Bus Phase, 84 
Bus Sequence, 84 
Bus Signals, vi 
Bus Structure, 41

Buses

Alternate, 18 
Cycles, 61 

Internal 64-bit Execution Unit Bus, 45 
NexBus, 42 
NxAD, 4, 24 
Operations, 61 
Snooping, 54 
Structure, 41 
Byte Enables, 25 
byte-enable bits, 75 
CACHBL, 28 
Cache, 47 

Cache and Memory Subsystem, 52 

Coherency, 53, 54 

Data, 52 

Instruction, 52 

Level-1, 52

Level-2, 29, 45, 52 

Level-2 Configuration, 53 

States, 55 

Cache and Memory Subsystem, 52 

Cache and Memory Unit, 47 

Cache Block, 84 

Cache Coherency, 53, 54 

Cache Control, 21 

Cache fills, 71

Cache Hit, 84 

Cache Line, 84 

Cache Lookup, 84 

Cache Miss, 84 

Cache-Block Tag, 84 

Cache-Hit Reads, 51 

Cacheable, 28, 71 

Caching Master, 84 

CADDR, 29, 61 

CBANK, 29, 61 

CDATA, 29, 61 

CKMODE, 30, 53, 60 

Clean, 84 

Clock Cycle, 84 

Clock Input Reference, 31

Clock Mode, 30

Clock Mode Select, 31

Clock Output Reference, 31



Clock Phase, 84 
Clock Phase 1, 30
Clocks, 30, 46
Cycles, 84
Generation, 60
L1-cache, 46
L2-cache, 46 
Modes, 60 
NexBus, 46 
Processor, 46 
COEA*, 29 
COEB*, 29 
Compatibility, 1 
CWE, 29 
Cycle, 85 
Cycle Control, 19 
D Cache, 52, 85 
D/C*, 27 
Data, vi 

Data or Code*, 27 
Data phase, 24, 66, 71
DCL*, 21, 57, 59, 74, 75
Decode Unit, 47 
DEVICE, 26 
Device, 85 
Dirty, 85 

Dirty Cache Line, 21, 57 
DMA, 75 

doubleword, vi, 25 
Dword, 85 
dword, vi 

Dword Address, 25 
Electrical Data, 77 
Endian Convention, vi 
Exclusive, 54, 58, 73, 85 
External Phase Inputs, 60 
External PLL Mode 
Clock Input, 30 
NxCLK, 30 
PHE1, 30 

Skewed NxCLK, 31 
XREF, 31 

External Processor Clock, 53, 60 
Fast NexBus5 Single-Qword Read with a 
delayed GXACK, 67
Fastest NexBus Single-Qword Read, 65
Fastest NexBus Single-Qword Write, 65 
Fastest NexBus5 Single-Qword Read, 67 
Fastest NexBus5 Single-Qword Write, 70 
Features, 3 

Floating Point Execution Unit, 47, 85 
Floating Point Interrupt Request, 32 
Floating-Point Execution Unit, 45 
Flush, 85 

Four-Qword Block Read (Cache-Block Fill), 26 
Four-Qword Block Write, 26 
Functional Unit, 85 
G, vi 

GALE, 4, 19, 24, 66, 73, 74 
Gate Address 20, 32 
GATEA20, 4, 18, 32 
GBLKNBL, 21, 28, 57, 66, 71
GDCL, 21, 57, 74 
Global Reset (Power-Up Reset), 32 
Global Write Enable, 29 
GNT*, 17, 64, 74, 75 
Grant NexBus, 17 
GREF, 33 

Ground Reference, 33 
Group Address Latch Enable, 19 
Group Block (Burst) Enable, 21 
Group Dirty Cache Line, 21 
Group Shared Data, 22 
Group Signal, 85 
Group Signals, 4 

Group Transfer Acknowledge, 20 
Group Transfer Hold, 20 
Group Try Again Later, 19 
GSHARE, 22, 56, 57, 58, 73 
GTAL, 19 

GXACK, 20, 24, 28, 58, 66, 67, 75 
GXHLD, 20, 24, 66, 67 
Halt, 27, 72 

High Non-Overlapping Time, 30, 60 
HROM, 33 
I Cache, 52, 85 
I/O, 26 

I/O Data Read, 27 
I/O Data Write, 27 
I/O operations, 64 



I/O Reads, 51
I/O space, 71 
INT instruction, 59 
Integer Unit, 47 

Internal 64-bit Execution Unit Bus, 45 
Internal Architecture, 47 
Internal PLL Mode
    IREF, 31
    NxCLK, 30
Interrupt, 32 

Interrupt Acknowledge, 27 
Interrupt Acknowledge Cycle, 72 
interrupt vector, 71
Interrupts, 59 
Intervenor operation, 59 
INTR*, 4, 18, 32, 59 
Invalid, 54, 57, 85 
Invalidate, 85 
IPC, 59 
IREF, 31 
JEDEC
    Bottom, 16
    Pinouts, 11
    Top, 15
k, vi 
L1, 85

L1-cache clock, 46
L2, 85 

L2 Cache Address, 29 
L2 Cache Bank, 29 
L2 Cache Data, 29 
L2 Cache Output Enable A, 29 
L2 Cache Output Enable B, 29 
L2 Cache Write Enable, 29 
L2-cache clock, 46 

Level-2 Asynchronous SRAM Accesses, 61 
Level-2 Cache, 45
    Asynchronous, 45, 53, 61
    Asynchronous Line Fill, 62
    Asynchronous READ, 61
    Asynchronous WRITE, 61
    SRAMMODE, 53
    Synchronous, 45, 53, 63
    Synchronous READ, 63
    Synchronous WRITE, 63
    WE*, 29

Level-2 Cache Configuration, 53 
Level-2 Cache Signals, 29 
Level-2 Synchronous SRAM Accesses, 63 
Line, 85 

LOCK*, 18, 64, 74 
Low Non-Overlapping Time, 30, 60 
M, vi 
M/IO*, 27 
Main Memory, 85 
Maskable Interrupt, 32 
Master ID, 26 
MCM, 2 
Mechanical
    Bottom, 82
    Side, 81
    Top, 80

Mechanical Data, 79 
Memory, 26, 85 
Memory Code Read, 27 
Memory Data Read, 27 
Memory Data Write, 27 
Memory operations, 64 
Memory or I/O*, 27 
Memory Reads, 51
Memory Reads on NexBus, 51 
Memory-Mapped I/O Reads, 51 
MESI, 85 

MESI cache-coherency protocol, 54 
MID, 26 

Modified, 54, 59, 86 

Modified Cache-Block Hit During Four-Qword (Block) Operations, 75



Modified Cache-Block Hit During Single-Qword Operations, 74
modified write-once, 54 
Modified Write-Once Protocol, 85 
modified, exclusive, shared, or invalid (MESI), 54

Multi-Chip-Module, 2 
MWO, 54, 86 
Names, v 

NB, 86 

NC, 33 

NexBus, v, 17, 42, 64, 86 
NexBus Address and Status, or Data, 24 
NexBus Arbitration and Address Phase, 64 
NexBus Clock, 30 
NexBus clock, 46 
NexBus Request, 17 
NexBus Slot ID, 18 
NexBus5, v, 17, 42, 44, 64 
NexBus5 Bus Operations
    Halt and Shutdown, 72
NexBus5 Cache Line Memory Operations, 71
NexBus5 Halt and Shutdown, 72 
NexBus5 I/O Operations, 71 
NexBus5 Interrupt-Acknowledge, 71
NexBus5 Intervenor Operations, 74 
NexBus5 Memory Operations
    Cache Line, 71
    Single-Qword, 66

NexBus5 Single-Qword Memory Operations, 66 
NexBus5 Single-Qword Read Hits Modified Cache Block, 75

NexBus5 Single-Qword Read with Wait States using GXHLD only, 69

NexBus5 Single-Qword Read with Wait States using a delayed GXACK, 68
NexBus5 Single-Qword Write with Wait States, 70

NMI*, 4, 18, 32, 59 
No-Op, 86 

Non-Maskable Interrupt, 32 
Notation, v 
NPIRQ*, 32 
NREQ*, 17, 64 
Nx586, 86 
Nx586 Features and Signals, 1
Nx586 Processor with Floating-Point Execution Unit, 2
NxAD, 24 
NxAD bus, 4 
NxCLK, 30, 46, 61 
NxMC, v 
NxPCI, v 
NxVL, v, 86 
Octet, 86 

Operating Frequencies, 46 

Operation, 72, 86 

Order of Transactions, 51

OWN*, 22, 56, 57, 58 

OWNABL, 22, 56, 57, 58, 66, 73 

Ownable, 22, 56 

Owned, 86 

Ownership, 86 

Ownership Request, 27, 56 

P4REF, 33 

Paged devices, 21 

passive exclusive use, 73 

Peripheral Controller, 86 

PGA Package side view, 81, 82 

PGA Package top view, 80 

PH1, 61

PH2, 61 

Phase, 86 

Phase-Locked Loop, 60 
PHE1, 30, 60 
PHE2, 30 
PLL, 60, 86 
PLL Analog Power, 31
PLL Mode
    External, 30, 31
    Non-Overlapping Time, 30, 60
    NxCLK, 30
    PHE1, 30
    PHE2, 30, 60
    XSEL, 31
POPHOLD, 33 
Power Reference, 33 
preemptive exclusive use, 73 
Present, 86 
Processor, 86 



Processor Clock, 46, 86 
Processor Clock Phase 1, 31
Publications, vii 
PULLHIGH, 33 
PULLLOW, 33 
quad word, vi, 25 
Qword, 86 
qword, vi 

Qword Address, 25 
Read Order, 51
read-modify-writes, 51
References, vii 
Reserved, 25, 27, 28, 33 
Reserved Bits and Signals, vi 
Reset, 32 

Reset CPU (Soft Reset), 32 
RESET*, 18, 32 
RESETCPU*, 18, 32, 72 
RISC86, 3 
SCLKE, 30, 53 
Serial In, 33 
Serial Out, 33 
SERIALIN, 33 
SERIALOUT, 33 
Set, 87 

SHARE*, 22, 56, 58, 73, 74 
Shared, 54, 58, 87 
Shared Data, 22, 56 
Shutdown, 27, 72 
signal organization, 4 
Signals, v
    Arbitration, 17
    Cache Control, 21
    Clocks, 30
    Cycle Control, 19
    Interrupt, 32
    Level-2 Cache, 29
    NexBus, 17
    NexBus Address and Data, 24
    NexBus5, 17
    NexBus5 Address and Data, 24
    Reserved, 33
    Reset, 32
    Test, 33

Single Qword Read or Write, 26 
Sirty, 59

SLOTID, 4, 18, 26 
SLOTID 0000, 18 
Snoop, 87 

Snoop Enable, 28, 57 
Snoop Hit, 87 
Snoop Miss, 87 
Snooping, 21, 54 
SNPNBL, 28, 57 
Source, vi, 87 
SRAM, 45 

SRAMMODE, 29, 53 
Stale, 59, 87 
State Transitions, 55 
Storage Hierarchy, 48 
Synchronous signals, 4 
System Bus, 87 
System Controller, 87 
T-Byte, 87 
Test, 33 

Test Phase 1 Clock, 33 
Test Phase 2 Clock, 33 
Test Power, 33 
TESTPWR*, 33 
Timing Diagrams, v 
TPH1, 33 
TPH2, 33 

Transaction Ordering, 51
Transceiver NexBus Clock Enable, 23 
Transceiver-to-NexBus Output Enable, 23 
Transceiver-to-NxAD-Bus Output Enable, 23 
Transceivers
    External, 23, 42
    Internal, 44
    Systems Logic, 43
    XBCKE*, 23
    XBOE*, 23
    XCVERE*, 23, 42
    XNOE*, 23
transceivers, 23 
Transfer Acknowledge, 19 
Transfer Hold, 20 
Transfer Type, 26 
Try Again Later, 19 
VDDA, 31 



video adapters, 21 
W/R*, 27 
WE*, 29 
Word, 87 
word, vi 

Write or Read*, 27 
Write Order, 51
write queue, 52 
Writes, 51 

x86 Architecture, vii 
XACK*, 19, 66, 67 
XBCKE*, 23 
XBOE*, 23 
XCVERE*, 23, 42, 44 
XHLD*, 20, 67, 73, 74, 75 
XNOE*, 23 
XPH1, 31 
XREF, 31 
XSEL, 31, 60